NUCLEIC ACIDS ENCODING POLYPEPTIDES 
HAVING PROTEASE ACTIVITY 



Cross-Reference to Related Applications 

This application is a continuation-in-part of pending U.S. application Serial No. 
08/873,479 filed on June 12, 1997, which application is fully incorporated herein by reference. 

Background of the Invention 

Field of the Invention 

The present invention relates to isolated nucleic acid sequences encoding polypeptides 
having protease activity. The invention also relates to nucleic acid constructs, vectors, and host 
cells comprising the nucleic acid sequences as well as recombinant methods for producing the 
polypeptides. 

Description of the Related Art 

Detergents formulated with proteolytic enzymes are known to have improved properties 
for removing stains. For example, SAVINASE™ (Novo Nordisk A/S, Bagsvaerd, Denmark), a 
microbial protease obtained from Bacillus lentus, has been introduced into many commercial 
brands of detergent. 

WO 88/01293 discloses proteases obtained from an alkalophilic Bacillus species having 
enhanced stability towards bleaching agents of the peroxy type. 

JP H97182 discloses a DNA sequence encoding an alkaline protease Y from Bacillus 
which is said to have good alkali and surfactant resistance and improves detergency. 

Many detergents are alkaline in solution (e.g., around pH 10). There is a need for new 
proteolytic enzymes with high activity at high pH which are stable towards bleaching agents. 
Proteases of the type disclosed in WO 88/01293 possess these characteristics, and are therefore 
highly desirable for use in detergent compositions. Heretofore, however, there has been no 
means of producing these enzymes recombinantly. 

It is an object of the present invention to provide for recombinant production of these 
valuable enzymes. 

Summary of the Invention 



The present invention relates to isolated nucleic acid sequences encoding polypeptides 
having protease activity, selected from the group consisting of: 

(a) a nucleic acid sequence encoding a polypeptide having an amino acid sequence 
which has at least 95% identity with the amino acid sequence of SEQ ID NO:43; 

(b) a nucleic acid sequence encoding a polypeptide having an amino acid sequence 
which has at least 85% identity with the amino acid sequence of SEQ ID NO:42; 

(c) a nucleic acid sequence having at least 95% homology with the mature polypeptide 
encoding region of the nucleic acid sequence of SEQ ID NO:41; 

(d) an allelic variant of (a), (b), or (c); and 

(e) a subsequence of (a), (b), (c), or (d), wherein the subsequence encodes a polypeptide 
fragment which has protease activity. 
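
By way of illustration only, the percentage thresholds recited in (a) through (c) can be checked with a short Python sketch that computes percent identity between two sequences that have already been aligned. This is a simplification under stated assumptions: the function names are hypothetical, columns containing a gap are skipped when counting, and the sketch does not perform the alignment itself or implement the Clustal method. The example strings are placeholders, not sequences from the Sequence Listing.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over aligned columns in which neither sequence has a gap ('-')."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a, aligned_b):
        if x == "-" or y == "-":
            continue  # assumption: gapped columns are excluded from the count
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no aligned positions to compare")
    return 100.0 * matches / compared


def meets_threshold(aligned_a: str, aligned_b: str, threshold: float) -> bool:
    """True when the percent identity is at least the given threshold."""
    return percent_identity(aligned_a, aligned_b) >= threshold


if __name__ == "__main__":
    # Placeholder sequences that differ at one of ten positions.
    a = "ACDEFGHIKL"
    b = "ACDEFGHIKV"
    print(percent_identity(a, b))
    print(meets_threshold(a, b, 85.0), meets_threshold(a, b, 95.0))
```

Other conventions for the denominator (for example, the length of the shorter sequence or the full alignment length) give different percentages, so the convention should be stated alongside any reported value.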

The present invention also relates to nucleic acid constructs, vectors, and host cells 
comprising the nucleic acid sequences as well as recombinant methods for producing the 
polypeptides. 

Brief Description of the Figures 

Figure 1 shows a restriction map of pShv2. 
Figure 2 shows a restriction map of pSJ1678. 
Figure 3 shows a restriction map of pSJ2882-MCS. 
Figure 4 shows a restriction map of pPL1759. 

Figures 5A and 5B show the nucleic acid sequence and the deduced amino acid 
sequence of a Bacillus JP170 (NCIB 12513) protease gene. 

Figures 6A and 6B show a comparison of the deduced amino acid sequence of a 
Bacillus JP170 (NCIB 12513) protease gene to the deduced amino acid sequences of other 
proteases. 

Figure 7 shows a restriction map of pPL2419. 
Figure 8 shows a restriction map of pCAsub2. 

Figure 9 shows comparative wash results in a model detergent of Bacillus sp JP170 
protease and SAVINASE™ in removing grass stain from cotton. 

Figure 10 shows comparative wash results in a Koso Top detergent of Bacillus sp 
JP170 protease and SAVINASE™ in removing grass stain from cotton. 

Detailed Description of the Invention 

Isolated Nucleic Acid Sequences Encoding Polypeptides Having Protease Activity 

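The term "isolated nucleic acid sequence" as used herein refers to a nucleic acid sequence 
which is essentially free of other nucleic acid sequences, e.g., at least about 20% pure, 
preferably at least about 40% pure, more preferably at least about 60% pure, even more 
preferably at least about 80% pure, and most preferably at least about 90% pure as determined 
by agarose electrophoresis. For example, an isolated nucleic acid sequence can be obtained by 
standard cloning procedures used in genetic engineering to relocate the nucleic acid sequence 
from its natural location to a different site where it will be reproduced. The cloning procedures 
may involve excision and isolation of a desired nucleic acid fragment comprising the nucleic 
acid sequence encoding the polypeptide, insertion of the fragment into a vector molecule, and 
incorporation of the recombinant vector into a host cell where multiple copies or clones of the 
nucleic acid sequence will be replicated. The nucleic acid sequence may be of genomic, cDNA, 
RNA, semisynthetic, synthetic origin, or any combinations thereof. 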
' TaLndemb^tbeptes^ 

amm0 "tZTlT^S and most preferably at leas, about 97%, which have 

;::rS ST- p— — *- >— - " homo,o8ous 

which differ by five amino acids, preferably by four amino acids, more preferably by three 

SEQ ID NO:43. In another preferred embodiment, the nucleic acid sequences of the present 

ID NO:43 or a fragment thereof, wherein the fragment has protease activity. In a most 
preferred embodiment, the nucleic acid sequence encodes a polypeptide which has the amino 
acid sequence of SEQ ID NO:42 or SEQ ID NO:43. 

SEQ ID NO:41 by virtue of the degeneracy of the 
genetic code. The present invention also relates to subsequences of SEQ ID NO:41 which 
encode fragments of SEQ ID NO:42 or SEQ ID NO:43 which have protease activity. 
A subsequence of SEQ ID NO:41 is a nucleic acid sequence encompassed by SEQ ID 
NO:41 except that one or more nucleotides from the 5' and/or 3' end have been deleted. 
Preferably, a subsequence contains at least 1029 nucleotides, more preferably at least 1119 

SEQ ID NO:43 is a polypeptide having one or more amino acids deleted from the amino and/or 
carboxy terminus of this amino acid sequence. Preferably, a fragment contains at least 
amino acid residues, more preferably at least 171 

least 403 amino acid residues. 

An allelic variant denotes any of two or more alternative forms of a gene occupying the 
same chromosomal locus. Allelic variation arises naturally through mutation, and may result in 
phenotypic polymorphism within populations. Gene mutations can be silent (no change in the 
encoded polypeptide) or may encode polypeptides having altered amino acid sequences. The 
term allelic variant of a polypeptide is a polypeptide encoded by an allelic variant of a gene. 
The amino acid sequences of the homologous polypeptides may differ 

amino acid residues and/or the substitution of one or more amino acid residues by different 
amino acid residues. Preferably, amino acid changes are of a minor nature, that is, conservative 
amino acid substitutions that do not significantly affect the folding and/or activity of the 
protein; small deletions, typically of one to about 30 amino acids; small amino- or carboxyl-
terminal extensions, such as an amino-terminal methionine residue; a small linker peptide 

Examples of conservative substitutions are within the group of basic amino acids (such 
as arginine, lysine and histidine), acidic amino acids (such as glutamic acid and aspartic acid), 
polar amino acids (such as glutamine and asparagine), hydrophobic amino acids (such as 
leucine, isoleucine and valine), aromatic amino acids (such as phenylalanine, tryptophan and 
tyrosine), and small amino acids (such as glycine, alanine, serine, threonine and methionine). 
Amino acid substitutions which do not generally alter the specific activity are known in the art 
and are described, for example, by H. Neurath and R.L. Hill, 1979, In, The Proteins, Academic 
Press, New York. The most commonly occurring exchanges are Ala/Ser, Val/Ile, Asp/Glu, 
Thr/Ser, Ala/Gly, Ala/Thr, Ser/Asn, Ala/Val, Ser/Gly, Tyr/Phe, Ala/Pro, Lys/Arg, Asp/Asn, 
Leu/Ile, Leu/Val, Ala/Glu, and Asp/Gly as well as these in reverse. 
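
For illustration, the groups of conservative substitutions named above can be written down as a small lookup, sketched here in Python. The group names, the one-letter residue codes, and the function names are choices made for this sketch and are not part of the specification; the rule that two different residues count as conservative when they share a group is a simplifying assumption.

```python
# Groups as named in the passage above, using one-letter residue codes.
CONSERVATIVE_GROUPS = {
    "basic": set("RKH"),        # arginine, lysine, histidine
    "acidic": set("ED"),        # glutamic acid, aspartic acid
    "polar": set("QN"),         # glutamine, asparagine
    "hydrophobic": set("LIV"),  # leucine, isoleucine, valine
    "aromatic": set("FWY"),     # phenylalanine, tryptophan, tyrosine
    "small": set("GASTM"),      # glycine, alanine, serine, threonine, methionine
}


def shared_groups(a: str, b: str) -> list:
    """Names of the groups that contain both residues, in alphabetical order."""
    return sorted(
        name for name, members in CONSERVATIVE_GROUPS.items()
        if a in members and b in members
    )


def is_conservative(a: str, b: str) -> bool:
    """True when two different residues fall within at least one common group."""
    return a != b and bool(shared_groups(a, b))


if __name__ == "__main__":
    for pair in [("K", "R"), ("D", "E"), ("K", "D")]:
        print(pair, is_conservative(*pair))
```

Several of the commonly occurring exchanges listed above (for example Ala/Val or Asp/Gly) cross these groups, so a group lookup of this kind is narrower than the exchange list.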

In a third embodiment, the present invention relates to isolated nucleic acid sequences 
which have a degree of homology to the mature polypeptide coding sequence of SEQ ID 
NO:41 of at least about 95% homology, and preferably at least about 97% homology, which 
encode a polypeptide having protease activity; or allelic variants and subsequences of SEQ ID 
NO:41 which encode polypeptide fragments which have protease activity. For purposes of the 
present invention, the degree of homology between two nucleic acid sequences is determined 
by the Clustal method (Higgins, 1989, supra) with an identity table, a gap penalty of 10, and a 
gap length penalty 

relates to nucleic acid sequences 
encoding polypeptides having protease activity which hybridize under low stringency 
conditions, more preferably medium stringency conditions, and most preferably high stringency 
conditions, with an oligonucleotide probe which hybridizes under the same conditions with the 
nucleic acid sequence of SEQ ID NO:41 or its complementary strand (J. Sambrook, E.F. 
Fritsch and T. Maniatis, 1989, Molecular Cloning, A Laboratory Manual, 2d edition, Cold 
Spring Harbor, New York); or allelic variants and subsequences of SEQ ID NO:41 which 
encode polypeptide fragments which have protease activity. 

The nucleic acid sequence of SEQ ID NO:41, or a subsequence thereof, as well as the 
amino acid sequence of SEQ ID NO:42 or SEQ ID NO:43, or a partial sequence thereof, may 
be used to design an oligonucleotide probe to identify and clone DNA encoding polypeptides 
having protease activity from strains of different genera or species according to methods well 
known in the art. In particular, such probes can be used for hybridization with the genomic or 
cDNA of the genus or species of interest, following standard Southern blotting procedures, in 
order to identify and isolate the corresponding gene therein. Such probes can be considerably 
shorter than the entire sequence, but should be at least 15, preferably at least 25, and more 
preferably at least 40 nucleotides in length. Longer probes can also be used. Both DNA and 
RNA probes can be used. The probes are typically labeled for detecting the corresponding gene 
(for example, with 32P, 3H, 35S, biotin, or avidin). 
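
As a non-limiting sketch of the probe lengths discussed above, the following Python fragment enumerates candidate probes of a chosen length along a template and provides the reverse complement for probing the opposite strand. The function names, the default window length of 25, the step size, and the template string are all hypothetical choices for this sketch; the template is a placeholder and is not SEQ ID NO:41. The only figure taken from the passage is the minimum length of 15 nucleotides.

```python
MIN_PROBE_LENGTH = 15  # minimum probe length stated in the passage above

_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def candidate_probes(template: str, length: int = 25, step: int = 5) -> list:
    """Windows of the given length taken every `step` nucleotides along the template."""
    if length < MIN_PROBE_LENGTH:
        raise ValueError("probe length below the stated minimum of 15 nucleotides")
    if step < 1:
        raise ValueError("step must be positive")
    return [
        template[i:i + length]
        for i in range(0, len(template) - length + 1, step)
    ]


if __name__ == "__main__":
    # Placeholder template, not a sequence from the Sequence Listing.
    template = "ATGAAACGTCTGGCTATCGGTACCGCTAGCTTAGGCATCG"
    for probe in candidate_probes(template, length=25, step=5):
        print(probe, reverse_complement(probe))
```

In practice probe selection also weighs melting temperature, degeneracy, and specificity, none of which this sketch attempts.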

Thus, a genomic, cDNA or combinatorial chemical library prepared from such other 
organisms may be screened for DNA which hybridizes with the probes described above and 
which encodes a polypeptide having protease activity. Genomic or other DNA from such other 
organisms may be separated by agarose or polyacrylamide gel electrophoresis, or other 
separation techniques. DNA from the libraries or the separated DNA may be transferred to and 
immobilized on nitrocellulose or other suitable carrier material. In order to identify a clone or 
DNA which is homologous with SEQ ID NO:41, the carrier material is used in a Southern blot. 
Hybridization indicates that the nucleic acid sequence hybridizes to the oligonucleotide probe 
corresponding to the polypeptide encoding part of the nucleic acid sequence shown in SEQ ID 
NO:41, under low to high stringency conditions (i.e., prehybridization and hybridization at 
42°C in 5X SSPE, 0.3% SDS, 200 µg/ml sheared and denatured salmon sperm DNA, and either 
25, 35 or 50% formamide for low, medium and high stringencies, respectively), following 
standard Southern blotting procedures. The carrier material is finally washed three times each 
for 30 minutes using 2X SSC, 0.2% SDS preferably at least 50°C (very low stringency), more 
preferably at least 55°C (low stringency), more preferably at least 60°C (medium stringency), 
more preferably at least 65°C (medium-high stringency), even more preferably at least 70°C 
(high stringency), and most preferably at least 75°C (very high stringency). Molecules to which 
the oligonucleotide probe hybridizes under these conditions are detected using X-ray film. 
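
The hybridization and wash conditions recited above lend themselves to a simple tabulation, sketched here in Python for convenience. The numerical values are those of the passage; the variable and function names are hypothetical, and treating a wash temperature as belonging to the highest tier whose minimum it reaches is an interpretive assumption of this sketch.

```python
# Formamide percentage in the hybridization buffer for each stringency level.
FORMAMIDE_PERCENT = {"low": 25, "medium": 35, "high": 50}

# Minimum wash temperature (degrees C) for each wash stringency tier, ascending.
WASH_TIERS = [
    (50, "very low"),
    (55, "low"),
    (60, "medium"),
    (65, "medium-high"),
    (70, "high"),
    (75, "very high"),
]


def wash_stringency(temp_c: float):
    """Highest tier whose minimum temperature is reached, or None below 50 degrees C."""
    label = None
    for minimum, name in WASH_TIERS:
        if temp_c >= minimum:
            label = name
    return label


if __name__ == "__main__":
    for temp in (45, 50, 62, 75):
        print(temp, wash_stringency(temp))
```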

The nucleic acid sequences of the present invention may be obtained from 
microorganisms of any genus. For purposes of the present invention, the term "obtained from" 
as used herein in connection with a given source shall mean that the polypeptide encoded by the 
nucleic acid sequence is produced by the source or by a cell in which the nucleic acid sequence 
from the source has been inserted. 

The nucleic acid sequences may be obtained from a bacterial source. For example, the 
nucleic acid sequences may be obtained from a gram positive bacterium such as a Bacillus 
strain or a Streptomyces strain, e.g., Streptomyces lividans or Streptomyces murinus; or from a 
gram negative bacterium, e.g., E. coli or Pseudomonas sp. 

In a preferred embodiment, a nucleic acid sequence of the present invention is obtained 
from a strain of the genus Bacillus, as defined by Fergus G. Priest in Abraham L. Sonenshein, 
James A. Hoch, and Richard Losick, editors, Bacillus subtilis and Other Gram-Positive 
Bacteria, American Society For Microbiology, Washington, D.C., 1993, pages 3-16. 

In a more preferred embodiment, the nucleic acid sequences are obtained from a 
Bacillus alkalophilus, Bacillus amyloliquefaciens, Bacillus brevis, Bacillus circulans, Bacillus 
coagulans, Bacillus firmus, Bacillus lautus, Bacillus lentus, Bacillus licheniformis, Bacillus 
megaterium, Bacillus pumilus, Bacillus stearothermophilus, Bacillus subtilis, or Bacillus 
thuringiensis strain. 

In a most preferred embodiment, the nucleic acid sequence is obtained from Bacillus 
strain NCIB 12513, e.g., the nucleic acid sequence set forth in SEQ ID NO:41. In another most 
preferred embodiment, the nucleic acid sequence is the sequence contained in plasmid 

Strains of these species are readily accessible to the public in a number of culture 
collections, such as the American Type Culture Collection (ATCC), Deutsche Sammlung von 
Mikroorganismen und Zellkulturen GmbH (DSM), Centraalbureau Voor Schimmelcultures 
(CBS), and Agricultural Research Service Patent Culture Collection, Northern Regional 
Research Center (NRRL). 

Furthermore, such nucleic acid sequences may be identified and obtained from other 
sources including microorganisms isolated from nature (e.g., soil, composts, water, etc.) using 
the above-mentioned probes. Techniques for isolating microorganisms from natural habitats 
are well known in the art. The nucleic acid sequence may then be derived by similarly 
screening a genomic or cDNA library of another microorganism. Once a nucleic acid sequence 
encoding a polypeptide has been detected with the probe(s), the sequence may be isolated or 
cloned by utilizing techniques which are known to those of ordinary skill in the art (see, e.g., 
Sambrook et al., 1989, supra). 

The techniques used to isolate or clone a nucleic acid sequence encoding a polypeptide 
are known in the art and include isolation from genomic DNA, preparation from cDNA, or a 
combination thereof. The cloning of the nucleic acid sequences of the present invention from 
such genomic DNA can be effected, e.g., by using the well known polymerase chain reaction 
(PCR) or antibody screening of expression libraries to detect cloned DNA fragments with 
shared structural features. See, e.g., Innis et al., 1990, PCR: A Guide to Methods and 
Application, Academic Press, New York. Other nucleic acid amplification procedures such as 
ligase chain reaction (LCR), ligated activated transcription (LAT) and nucleic acid sequence-
based amplification (NASBA) may be used. The nucleic acid sequence may be cloned from a 
strain of Bacillus, or another or related organism and thus, for example, may be an allelic or 
species variant of the polypeptide encoding region of the nucleic acid sequence. 

Modification of a nucleic acid sequence of the present invention may be necessary for 
the synthesis of polypeptides substantially similar to the polypeptide. The term "substantially 
similar" to the polypeptide refers to non-naturally occurring forms of the polypeptide. These 
polypeptides may differ in some engineered way from the polypeptide isolated from its native 
source. For example, it may be of interest to synthesize variants of the polypeptide where the 
variants differ in specific activity, thermostability, pH optimum, or the like using, e.g., site-
directed mutagenesis. The analogous sequence may be constructed on the basis of the nucleic 
acid sequence presented as the polypeptide encoding part of SEQ ID NO:41, e.g., a 
subsequence thereof, and/or by introduction of nucleotide substitutions which do not give rise 
to another amino acid sequence of the polypeptide encoded by the nucleic acid sequence, but 
which corresponds to the codon usage of the host organism intended for production of the 
enzyme, or by introduction of nucleotide substitutions which may give rise to a different amino 
acid sequence. For a general description of nucleotide substitution, see, e.g., Ford et al., 1991, 
Protein Expression and Purification 2: 95-107. 
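
The notion of nucleotide substitutions that do not give rise to another amino acid sequence, but adapt the sequence to the codon usage of a host, can be sketched in Python as follows. The sketch assumes the standard genetic code; the function names, the example coding sequence, and the "preferred codon" map are hypothetical and do not reflect measured codon usage of any particular host.

```python
from itertools import product

# Standard genetic code, laid out in the conventional T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna: str) -> str:
    """Translate a coding sequence; '*' marks a stop codon."""
    if len(dna) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def recode(dna: str, preferred: dict) -> str:
    """Replace each codon with the preferred synonymous codon for its amino acid."""
    protein = translate(dna)  # also validates the input
    out = []
    for index, aa in enumerate(protein):
        codon = dna[3 * index:3 * index + 3]
        new_codon = preferred.get(aa, codon)
        if CODON_TABLE[new_codon] != aa:
            raise ValueError(f"{new_codon} does not encode {aa}")
        out.append(new_codon)
    return "".join(out)


if __name__ == "__main__":
    # Hypothetical coding sequence and hypothetical host preferences.
    cds = "ATGAAGTTATAA"
    preferred = {"K": "AAA", "L": "CTG"}
    recoded = recode(cds, preferred)
    print(recoded, translate(cds) == translate(recoded))
```

The check inside `recode` ensures that every replacement is synonymous, so the encoded polypeptide is unchanged by construction.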

It will be apparent to those skilled in the art that such substitutions can be made outside 
the regions critical to the function of the molecule and still result in an active polypeptide. 
Amino acid residues essential to the activity of the polypeptide encoded by the isolated nucleic 
acid sequence of the invention, and therefore preferably not subject to substitution, may be 
identified according to procedures known in the art, such as site-directed mutagenesis or 
alanine-scanning mutagenesis (see, e.g., Cunningham and Wells, 1989, Science 244: 1081-
1085). In the latter technique, mutations are introduced at every positively charged residue in 
the molecule, and the resultant mutant molecules are tested for protease activity to identify 
amino acid residues that are critical to the activity of the molecule. Sites of substrate-enzyme 
interaction can also be determined by analysis of the three-dimensional structure as determined 
by such techniques as nuclear magnetic resonance analysis, crystallography or photoaffinity 
labelling (see, e.g., de Vos et al., 1992, Science 255: 306-312; Smith et al., 1992, Journal of 
Molecular Biology 224: 899-904; Wlodaver et al., 1992, FEBS Letters 309: 59-64). 
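
As a minimal sketch of the alanine-scanning design step described above, the following Python function lists the single-residue variants in which each positively charged residue is replaced by alanine. Treating lysine (K) and arginine (R) as the positively charged residues is an assumption of this sketch (histidine may be added through the `targets` parameter); the function name and the example sequence are hypothetical, and the sketch covers only variant enumeration, not the activity assay.

```python
def alanine_scan(protein: str, targets: str = "KR") -> list:
    """Return (label, variant) pairs, one per targeted residue replaced by alanine.

    Labels use the usual notation, e.g. 'K2A' for lysine at position 2 to alanine.
    """
    variants = []
    for index, residue in enumerate(protein):
        if residue in targets:
            label = f"{residue}{index + 1}A"
            variant = protein[:index] + "A" + protein[index + 1:]
            variants.append((label, variant))
    return variants


if __name__ == "__main__":
    # Placeholder sequence, not a sequence from the Sequence Listing.
    for label, variant in alanine_scan("MKTRLHE"):
        print(label, variant)
```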

A nucleic acid sequence of the present invention may also encode fused polypeptides or 
cleavable fusion polypeptides in which another polypeptide is fused at the N-terminus or the C-
terminus of the polypeptide or fragment thereof. A fused polypeptide is produced by fusing a 
nucleic acid sequence (or a portion thereof) encoding another polypeptide to a nucleic acid 
sequence (or a portion thereof) of the present invention. Techniques for producing fusion 
polypeptides are known in the art, and include ligating the coding sequences encoding the 
polypeptides so that they are in frame and that expression of the fused polypeptide is under 
control of the same promoter(s) and terminator. 
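
The requirement that the ligated coding sequences be in frame can be illustrated with a short Python sketch. It assumes that both inputs are complete coding sequences whose lengths are multiples of three and that a terminal stop codon on the upstream sequence, if present, is removed before joining; the function name and the example sequences are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def fuse_in_frame(upstream_cds: str, downstream_cds: str) -> str:
    """Join two coding sequences so that the downstream one stays in frame."""
    for name, cds in (("upstream", upstream_cds), ("downstream", downstream_cds)):
        if len(cds) % 3 != 0:
            raise ValueError(f"{name} coding sequence is not a whole number of codons")
    # Drop a terminal stop codon so translation continues into the fusion partner.
    if upstream_cds[-3:] in STOP_CODONS:
        upstream_cds = upstream_cds[:-3]
    return upstream_cds + downstream_cds


if __name__ == "__main__":
    # Placeholder coding sequences for illustration only.
    fused = fuse_in_frame("ATGAAATAA", "ATGCTGTGA")
    print(fused, len(fused) % 3 == 0)
```

Internal stop codons and linker or cleavage-site sequences between the two partners are outside the scope of this sketch.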

Nucleic Acid Constructs 

The present invention also relates to nucleic acid constructs comprising a nucleic acid 
sequence of the present invention operably linked to one or more control sequences which 
direct the expression of the coding sequence in a suitable host cell under conditions compatible 
with the control sequences. Expression will be understood to include any step involved in the 
production of the polypeptide having protease activity including, but not limited to, 
transcription, post-transcriptional modification, translation, post-translational modification, and 
secretion. 

"Nucleic acid construct" is defined herein as a nucleic acid molecule, either single- or 
double-stranded, which is isolated from a naturally occurring gene or which has been modified 
to contain segments of nucleic acid which are combined and juxtaposed in a manner which 
would not otherwise exist in nature. The term nucleic acid construct is synonymous with the 
term expression cassette when the nucleic acid construct contains all the control sequences 
required for expression of a coding sequence of the present invention. The term "coding 
sequence" as defined herein is a sequence which is transcribed into mRNA and translated into 
a polypeptide. The boundaries of the coding sequence are generally determined by a ribosome 
binding site located just upstream of the open reading frame at the 5' end of the mRNA and a 
transcription terminator sequence located just downstream of the open reading frame at the 3' 
end of the mRNA. A coding sequence can include, but is not limited to, DNA, cDNA, and 
recombinant nucleic acid sequences. 

An isolated nucleic acid sequence encoding a polypeptide may be manipulated in a 
variety of ways to provide for expression of the polypeptide having protease activity. 
Manipulation of the nucleic acid sequence prior to its insertion into a vector may be desirable 
or necessary depending on the expression vector. The techniques for modifying nucleic acid 
sequences utilizing cloning methods are well known in the art. 

The term "control sequences" is defined herein to include all components which are 
necessary or advantageous for the expression of a polypeptide. Each control sequence may be 
native or foreign to the nucleic acid sequence encoding the polypeptide. Such control 
sequences include, but are not limited to, a leader, a propeptide sequence, a promoter, a signal 
sequence, and a transcription terminator. At a minimum, the control sequences include a 
promoter, and transcriptional and translational stop signals. The control sequences may be 
provided with linkers for the purpose of introducing specific restriction sites facilitating ligation 
of the control sequences with the coding region of the nucleic acid sequence encoding a 
polypeptide. The term "operably linked" is defined herein as a configuration in which a 
control sequence is appropriately placed at a position relative to the coding sequence of the 
DNA sequence such that the control sequence directs the production of a polypeptide. 

The control sequence may be an appropriate promoter sequence, a nucleic acid 
sequence which is recognized by a host cell for expression of the nucleic acid sequence. The 
promoter sequence contains transcriptional control sequences which mediate the expression of 
the polypeptide. The promoter may be any nucleic acid sequence which shows transcriptional 
activity in the host cell of choice. 

Examples of suitable promoters for directing the transcription of the nucleic acid 
constructs of the present invention in a bacterial host cell are the promoters obtained 
from the E. coli lac operon, the Streptomyces coelicolor agarase gene (dagA), the Bacillus 
subtilis levansucrase gene, the Bacillus licheniformis alpha-amylase gene, the 
Bacillus stearothermophilus maltogenic amylase gene, the Bacillus amyloliquefaciens 
alpha-amylase gene, the Bacillus licheniformis penicillinase gene, the Bacillus 
subtilis xylA and xylB genes, and the prokaryotic beta-lactamase gene (Villa-Kamaroff et al., 
1978, Proceedings of the National Academy of Sciences USA 75: 3727-3731), as well as the tac 
promoter (DeBoer et al., 1983, Proceedings of the National Academy of Sciences USA 80: 21-
25). Further promoters are described in "Useful proteins from recombinant bacteria" in 
Scientific American, 1980, 242: 74-94; and in Sambrook et al., 1989, supra. 

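Any terminator which is functional in the host cell of choice may be used in the present 
invention. 

The control sequence may also be a suitable leader sequence, a nontranslated region of 
an mRNA which is important for translation by the host cell. The leader sequence is operably 
linked to the 5' terminus of the nucleic acid sequence encoding the polypeptide. 

The control sequence may also be a signal peptide coding region, which codes for an 
amino acid sequence linked to the amino terminus of a polypeptide which can direct the 
encoded polypeptide into the cell's secretory pathway. The 5' end of the coding sequence 
of the nucleic acid sequence may inherently contain a signal peptide coding region naturally 
linked in translation reading frame with the segment of the coding region which encodes the 
secreted polypeptide. Alternatively, the 5' end of the coding sequence may contain a signal 
peptide coding region which is foreign to the coding sequence. The foreign signal peptide coding 
region may be required where the coding sequence does not normally contain a signal peptide 
coding region. Alternatively, the foreign signal peptide coding region may simply replace the 
natural signal peptide coding region in order to obtain enhanced secretion of the polypeptide. 

Bacillus species, or the calf preprochymosin gene. However, any signal peptide coding region 
capable of directing the expressed polypeptide into the secretory pathway of a host cell of 
choice may be used in the present invention. 
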
An effective signal peptide coding region for bacterial host cells is the signal peptide 
coding region obtained from the maltogenic amylase gene from Bacillus NCIB 11837, the 
Bacillus stearothermophilus alpha-amylase gene, the Bacillus licheniformis subtilisin gene, the 
Bacillus licheniformis beta-lactamase gene, the Bacillus stearothermophilus neutral proteases 
genes (nprT, nprS, nprM), or the Bacillus subtilis prsA gene. Further signal peptides are 
described by Simonen and Palva, 1993, Microbiological Reviews 57: 109-137. 

The control sequence may also be a propeptide coding region, which codes for an amino 
acid sequence positioned at the amino terminus of a polypeptide. The resultant polypeptide is 
known as a proenzyme or propolypeptide. A propolypeptide is 
generally inactive and can be converted to a mature active polypeptide by catalytic or 
autocatalytic cleavage of the propeptide from the propolypeptide. The propeptide coding region 
may be obtained from the Bacillus subtilis alkaline protease gene (aprE), or the Bacillus subtilis 
neutral protease gene (nprT). 

Where both signal peptide and propeptide regions are present at the amino terminus of a 
polypeptide, the propeptide region is positioned next to the amino terminus of the polypeptide 
and the signal peptide region is positioned next to the amino terminus of the propeptide region. 
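
The ordering of signal peptide, propeptide, and mature polypeptide described above can be pictured with a small Python data structure. The class name, field names, and example segments are hypothetical placeholders chosen for this sketch; real segment boundaries would come from the deduced amino acid sequence and experimental data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Precursor:
    """A secreted precursor laid out as signal peptide, propeptide, mature polypeptide."""
    signal: str
    propeptide: str
    mature: str

    def sequence(self) -> str:
        # The propeptide sits directly at the amino terminus of the mature polypeptide,
        # and the signal peptide precedes the propeptide.
        return self.signal + self.propeptide + self.mature

    def boundaries(self) -> dict:
        """Zero-based, end-exclusive coordinates of each region in the precursor."""
        s_end = len(self.signal)
        p_end = s_end + len(self.propeptide)
        return {
            "signal": (0, s_end),
            "propeptide": (s_end, p_end),
            "mature": (p_end, p_end + len(self.mature)),
        }


if __name__ == "__main__":
    # Placeholder segments for illustration only.
    pre = Precursor(signal="MKK", propeptide="AEE", mature="AQSV")
    print(pre.sequence(), pre.boundaries())
```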

The nucleic acid constructs of the present invention may also comprise one or more 
nucleic acid sequences which encode one or more factors that are advantageous for directing the 
expression of the polypeptide, e.g., a transcriptional activator (e.g., a trans-acting factor), a 
chaperone, and a processing protease. Any factor that is functional in the host cell of choice 
may be used in the present invention. The nucleic acids encoding one or more of these factors 
are not necessarily in tandem with the nucleic acid sequence encoding the polypeptide. 

A transcriptional activator is a protein which activates transcription of a nucleic acid 
sequence encoding a polypeptide (Kudla et al., 1990, EMBO Journal 9: 1355-1364; Jarai and 
Buxton, 1994, Current Genetics 26: 2238-244; Verdier, 1990, Yeast 6: 271-297). The nucleic 
acid sequence encoding an activator may be obtained from the gene encoding Bacillus 
stearothermophilus NprA (nprA). 

A chaperone is a protein which assists another polypeptide to fold properly (Hartl et al., 
1994, TIBS 19: 20-25; Bergeron et al., 1994, TIBS 19: 124-128; Demolder et al., 1994, Journal 
of Biotechnology 32: 179-189; Craig, 1993, Science 260: 1902-1903; Gething and Sambrook, 
1992, Nature 355: 33-45; Puig and Gilbert, 1994, Journal of Biological Chemistry 269: 7764-
7771; Wang and Tsou, 1993, The FASEB Journal 7: 1515-11157; Robinson et al., 1994, 
Bio/Technology 1: 381-384; Jacobs et al., 1993, Molecular Microbiology 8: 957-966). The 
nucleic acid sequence encoding a chaperone may be obtained from the genes encoding Bacillus 
subtilis GroE proteins and Bacillus subtilis PrsA. For further examples, see Gething and 
Sambrook, 1992, supra, and Hartl et al., 1994, supra. 

A processing protease is a protease that cleaves a propeptide to generate a mature 
biochemically active polypeptide (Enderlin and Ogrydziak, 1994, Yeast 10: 67-79; Fuller et al., 
1989, Proceedings of the National Academy of Sciences USA 86: 1434-1438; Julius et al., 1984, 
Cell 37: 1075-1089; Julius et al., 1983, Cell 32: 839-852; U.S. Patent No. 5,702,934). The 
nucleic acid sequence encoding a processing protease may be obtained from the genes encoding 
Saccharomyces cerevisiae dipeptidylaminopeptidase, Saccharomyces cerevisiae Kex2, 
Yarrowia lipolytica dibasic processing endoprotease (xpr6), and Fusarium oxysporum 
metalloprotease (p45 gene). 

It may also be desirable to add regulatory sequences which allow the regulation of the 
expression of the polypeptide relative to the growth of the host cell. Examples of regulatory 
systems are those which cause the expression of the gene to be turned on or off in response to a 
chemical or physical stimulus, including the presence of a regulatory compound. Regulatory 
systems in prokaryotic systems would include the lac, tac, and trp operator systems. Other 
examples of regulatory sequences are those which allow for gene amplification. In eukaryotic 
systems, these include the dihydrofolate reductase gene which is amplified in the presence of 
methotrexate, and the metallothionein genes which are amplified with heavy metals. In these 
cases, the nucleic acid sequence encoding the polypeptide would be operably linked with the 
regulatory sequence. 

Expression Vectors 

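The present invention also relates to recombinant expression vectors comprising a 
nucleic acid sequence of the present invention, a promoter, and transcriptional and translational 
stop signals. The various nucleic acid and control sequences described above may be joined 
together to produce a recombinant expression vector which may include one or more 
convenient restriction sites to allow for insertion or substitution of the nucleic acid sequence 
encoding the polypeptide at such sites. Alternatively, the nucleic acid sequence of the present 
invention may be expressed by inserting the nucleic acid sequence or a nucleic acid construct 
comprising the sequence into an appropriate vector for expression. In creating the expression 
vector, the coding sequence is located in the vector so that the coding sequence is operably 
linked with the appropriate control sequences for expression, and possibly secretion. 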

The recombinant expression vector may be any vector (e.g., a plasmid or virus) which 
can be conveniently subjected to recombinant DNA procedures and can bring about the 
expression of the nucleic acid sequence. The choice of the vector will typically depend on the 
compatibility of the vector with the host cell into which the vector is to be introduced. The 
vectors may be linear or closed circular plasmids. The vector may be an autonomously 
replicating vector, i.e., a vector which exists as an extrachromosomal entity, the replication of 
which is independent of chromosomal replication, e.g., a plasmid, an extrachromosomal 
element, a minichromosome, or an artificial chromosome. The vector may contain any means 
for assuring self-replication. Alternatively, the vector may be one which, when introduced into 
the host cell, is integrated into the genome and replicated together with the chromosome(s) into 
which it has been integrated. The vector system may be a single vector or plasmid or two or 
more vectors or plasmids which together contain the total DNA to be introduced into the 
genome of the host cell, or a transposon. 

The vectors of the present invention preferably contain one or more selectable markers 
which permit easy selection of transformed cells. A selectable marker is a gene the product of 
which provides for biocide or viral resistance, resistance to heavy metals, prototrophy to 
auxotrophs, and the like. Examples of bacterial selectable markers are the dal genes from 
Bacillus subtilis or Bacillus licheniformis, or markers which confer antibiotic resistance such as 
ampicillin, kanamycin, chloramphenicol, or tetracycline resistance. 

The vectors of the present invention preferably contain an element(s) that permits stable 
integration of the vector into the host cell genome or autonomous replication of the vector in the 
cell independent of the genome of the cell. 

For integration into the host cell genome, the vector may rely on the nucleic acid 
sequence encoding the polypeptide or any other element of the vector for stable integration of 
the vector into the genome by homologous or nonhomologous recombination. Alternatively, 
the vector may contain additional nucleic acid sequences for directing integration by 
homologous recombination into the genome of the host cell. The additional nucleic acid 
sequences enable the vector to be integrated into the host cell genome at a precise location(s) in 
the chromosome(s). To increase the likelihood of integration at a precise location, the 
integrational elements should preferably contain a sufficient number of nucleic acids, such as 
100 to 1,500 base pairs, preferably 400 to 1,500 base pairs, and most preferably 800 to 1,500 
base pairs, which are highly homologous with the corresponding target sequence to enhance the 
probability of homologous recombination. The integrational elements may be any sequence 
that is homologous with the target sequence in the genome of the host cell. On the other 
hand, the vector may be integrated into the genome of the host cell by non-homologous 
recombination. 

For autonomous replication, the vector may further comprise an origin of replication 
enabling the vector to replicate autonomously in the host cell in question. Examples of 
bacterial origins of replication are the origins of replication of plasmids pBR322, pUC19, 
pACYC177, and pACYC184 permitting replication in E. coli, and pUB110, pE194, pTA1060, 
and pAMβ1 permitting replication in Bacillus. The origin of replication may be one having a 
mutation which makes its functioning temperature-sensitive in the host cell (see, e.g., Ehrlich, 
1978, Proceedings of the National Academy of Sciences USA 75: 1433). 

More than one copy of a nucleic acid sequence of the present invention may be inserted 
into the host cell to increase production of the gene product. An increase in the copy number of 
the nucleic acid sequence can be obtained by integrating at least one additional copy of the 
sequence into the host cell genome or by including an amplifiable selectable marker gene with 
the nucleic acid sequence where cells containing amplified copies of the selectable marker gene, 
and thereby additional copies of the nucleic acid sequence, can be selected for by culturing the 
cells in the presence of the appropriate selectable agent. 

The procedures used to ligate the elements described above to construct the recombinant 
expression vectors of the present invention are well known to one skilled in the art (see, e.g., 
Sambrook et al, 1989, supra). 

Host Cells 

The present invention also relates to recombinant host cells, comprising a nucleic acid 
sequence of the invention, which are advantageously used in the recombinant production of the 
polypeptides. The term "host cell" encompasses any progeny of a parent cell which is not 
identical to the parent cell due to mutations that occur during replication. 

A vector comprising a nucleic acid sequence of the present invention is introduced into 
a host cell so that the vector is maintained as a chromosomal integrant or as a self-replicating 
extra-chromosomal vector. Integration is generally considered to be an advantage as the nucleic 
acid sequence is more likely to be stably maintained in the cell. Integration of the vector into 
the host chromosome may occur by homologous or non-homologous recombination as 
described above. 

The choice of a host cell will to a large extent depend upon the gene encoding the 
polypeptide and its source. The host cell may be a unicellular microorganism, e.g., a 
prokaryote, or a non-unicellular microorganism, e.g., a eukaryote. Useful unicellular cells are 
bacterial cells such as gram positive bacteria including, but not limited to, a Bacillus cell, e.g., 
Bacillus alkalophilus, Bacillus amyloliquefaciens, Bacillus brevis, Bacillus circulans, Bacillus 
coagulans, Bacillus firmus, Bacillus lautus, Bacillus lentus, Bacillus licheniformis, Bacillus 
megaterium, Bacillus pumilus, Bacillus stearothermophilus, Bacillus subtilis, or Bacillus 
thuringiensis; or a Streptomyces cell, e.g., Streptomyces lividans or Streptomyces murinus, or 
gram negative bacteria such as E. coli and Pseudomonas sp. In a preferred embodiment, the 
bacterial host cell is a Bacillus lentus, Bacillus licheniformis, Bacillus stearothermophilus or 
Bacillus subtilis cell. 

The introduction of a vector into a bacterial host cell may, for instance, be effected by
protoplast transformation (see, e.g., Chang and Cohen, 1979, Molecular General Genetics 168:
111-115), by using competent cells (see, e.g., Young and Spizizin, 1961, Journal of
Bacteriology 81: 823-829, or Dubnau and Davidoff-Abelson, 1971, Journal of Molecular
Biology 56: 209-221), by electroporation (see, e.g., Shigekawa and Dower, 1988, Biotechniques
6: 742-751), or by conjugation (see, e.g., Koehler and Thorne, 1987, Journal of Bacteriology
169: 5271-5278).

Methods of Production

The present invention also relates to methods for producing a polypeptide comprising
(a) cultivating a host cell under conditions suitable for production of the polypeptide; and (b)
recovering the polypeptide.

In the production methods of the present invention, the cells are cultivated in a nutrient
medium suitable for production of the polypeptide using methods known in the art. For
example, the cell may be cultivated by shake flask cultivation, small-scale or large-scale
fermentation in laboratory or industrial fermentors performed in a suitable medium and under
conditions allowing the polypeptide to be expressed and/or isolated. The cultivation takes place
in a suitable nutrient medium using procedures known in the art (see, e.g., M.V. Arbige et al.,
In Abraham L. Sonenshein, James A. Hoch, and Richard Losick, editors, Bacillus subtilis and
Other Gram-Positive Bacteria, American Society For Microbiology, Washington, D.C., 1993).
Suitable media are available from commercial suppliers or may be prepared according to
published compositions (e.g., in catalogues of the American Type Culture Collection). If the
polypeptide is secreted into the nutrient medium, the polypeptide can be recovered directly from
the medium. If the polypeptide is not secreted, it can be recovered from cell lysates.
The polypeptides may be detected using methods known in the art that are specific for
the polypeptides. These detection methods may include use of specific antibodies, formation of
an enzyme product, or disappearance of an enzyme substrate. For example, an enzyme assay
may be used to determine the activity of the polypeptide. Procedures for determining protease
activity are known in the art and include, e.g., measurement of fluorescence resulting from the
hydrolysis of casein labeled with fluorescein isothiocyanate.

The resulting polypeptide may be recovered from the nutrient medium by conventional
procedures including, but not limited to, centrifugation, filtration, extraction, spray-drying,
evaporation, or precipitation.

The polypeptides of the present invention may be purified by a variety of procedures
known in the art including, but not limited to, chromatography (e.g., ion exchange, affinity,
hydrophobic, chromatofocusing, and size exclusion), electrophoretic procedures (e.g.,
preparative isoelectric focusing), differential solubility (e.g., ammonium sulfate precipitation),
SDS-PAGE, or extraction (see, e.g., Protein Purification, J.-C. Janson and Lars Ryden, editors,
VCH Publishers, New York, 1989).

Removal or Reduction of Protease Activity 

The present invention also relates to methods for producing a mutant cell of a parent
cell, which comprises disrupting or deleting a nucleic acid sequence of the present invention or
a control sequence thereof, which results in the mutant cell producing less of the polypeptide
encoded by the nucleic acid sequence than the parent cell.

The construction of strains which have reduced protease activity may be conveniently
accomplished by modification or inactivation of a nucleic acid sequence of the present
invention necessary for expression of the polypeptide having protease activity in the cell. The
nucleic acid sequence to be modified or inactivated may be, for example, a nucleic acid
sequence encoding the polypeptide or a part thereof essential for exhibiting protease activity, or
the nucleic acid sequence may have a regulatory function required for the expression of the
polypeptide from the coding sequence of the nucleic acid sequence. An example of such a
regulatory or control sequence may be a promoter sequence or a functional part thereof, i.e., a
part which is sufficient for affecting expression of the polypeptide. Other control sequences for
possible modification are described above.

Modification or inactivation of the nucleic acid sequence may be performed by
subjecting the cell to mutagenesis and selecting for cells in which the protease producing
capability has been reduced. The mutagenesis, which may be specific or random, may be
performed, for example, by use of a suitable physical or chemical mutagenizing agent, by use of
a suitable oligonucleotide, or by subjecting the DNA sequence to PCR generated mutagenesis.

Examples of a physical or chemical mutagenizing agent suitable for the present purpose
include ultraviolet (UV) irradiation, hydroxylamine, N-methyl-N'-nitro-N-nitrosoguanidine
(MNNG), O-methyl hydroxylamine, nitrous acid, ethyl methane sulphonate (EMS), sodium
bisulphite, formic acid, and nucleotide analogues.

When such agents are used, the mutagenesis is typically performed by incubating the
cell to be mutagenized in the presence of the mutagenizing agent of choice under suitable
conditions, and selecting for cells exhibiting reduced protease activity or production.

Modification or inactivation of production of a polypeptide encoded by a nucleic acid
sequence of the present invention may be accomplished by introduction, substitution, or removal
of one or more nucleotides in the nucleic acid sequence or a regulatory element required for the
transcription or translation thereof. For example, nucleotides may be inserted or removed so as
to result in the introduction of a stop codon, the removal of the start codon, or a change of the
open reading frame. Such modification or inactivation may be accomplished by site-directed
mutagenesis or PCR generated mutagenesis in accordance with methods known in the art.
Although, in principle, the modification may be performed in vivo, i.e., directly on the cell
expressing the nucleic acid sequence to be modified, it is preferred that the modification be
performed in vitro as exemplified below.
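
The effect of inserting a nucleotide on the reading frame can be illustrated with a minimal sketch. The short sequence and the helper names below are hypothetical and chosen only for illustration; they are not taken from any sequence of the present application.

```python
# Hypothetical illustration: a single-nucleotide insertion shifts the reading
# frame and brings a premature stop codon into frame.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def codons(seq):
    """Split a coding sequence into complete triplets, starting at position 0."""
    usable = len(seq) - len(seq) % 3
    return [seq[i:i + 3] for i in range(0, usable, 3)]


def first_stop_index(seq):
    """Return the index of the first in-frame stop codon, or None if absent."""
    for index, codon in enumerate(codons(seq)):
        if codon in STOP_CODONS:
            return index
    return None


def insert_nucleotide(seq, position, base):
    """Return a copy of seq with one base inserted before the given position."""
    return seq[:position] + base + seq[position:]


# Made-up open reading frame: ATG GCT GAA CGT TTA GGC TAA
parent = "ATGGCTGAACGTTTAGGCTAA"
# Insert one T directly after the start codon.
mutant = insert_nucleotide(parent, 3, "T")

print(codons(parent), first_stop_index(parent))
print(codons(mutant), first_stop_index(mutant))
```

In this made-up case the parent frame reads through to its terminal stop codon, whereas the insertion places a TGA triplet in frame after only two codons.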

An example of a convenient way to inactivate or reduce production by a host cell of
choice is based on techniques of gene replacement or gene interruption. For example, in the
gene interruption method, a nucleic acid sequence corresponding to the endogenous gene or
gene fragment of interest is mutagenized in vitro to produce a defective nucleic acid sequence
which is then transformed into the host cell to produce a defective gene. By homologous
recombination, the defective nucleic acid sequence replaces the endogenous gene or gene
fragment. It may be desirable that the defective gene or gene fragment also encodes a marker
which may be used for selection of transformants in which the gene encoding the polypeptide
has been modified or destroyed.

Alternatively, modification or inactivation of a nucleic acid sequence of the present
invention may be performed by established anti-sense techniques using a nucleotide sequence
complementary to the polypeptide encoding sequence. More specifically, production of the
polypeptide by a cell may be reduced or eliminated by introducing a nucleotide sequence
complementary to the nucleic acid sequence encoding the polypeptide which may be
transcribed in the cell and is capable of hybridizing to the polypeptide mRNA produced in the
cell. Under conditions allowing the complementary anti-sense nucleotide sequence to hybridize
to the polypeptide mRNA, the amount of polypeptide translated is thus reduced or eliminated.
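
As a minimal sketch of the complementarity on which the anti-sense approach relies, the following derives the antiparallel complement of a short RNA and checks perfect base pairing. The mRNA fragment and the function names are hypothetical, and real hybridization depends on conditions that this sketch does not model.

```python
# Hypothetical illustration of anti-sense complementarity (Watson-Crick pairs only).
COMPLEMENT = {"A": "U", "C": "G", "G": "C", "U": "A"}


def antisense_rna(mrna):
    """Return the reverse complement of an RNA sequence, written 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(mrna))


def is_perfect_antisense(mrna, candidate):
    """True if candidate pairs with mrna at every position in antiparallel orientation."""
    if len(mrna) != len(candidate):
        return False
    return all(COMPLEMENT[a] == b for a, b in zip(mrna, reversed(candidate)))


mrna_fragment = "AUGGCUGAACGU"  # made-up fragment, not from the application
antisense = antisense_rna(mrna_fragment)

print(antisense)
print(is_perfect_antisense(mrna_fragment, antisense))
```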

It is preferred that the cell to be modified in accordance with the methods of the present 
invention is of microbial origin, for example, a Bacillus strain which is suitable for the 
production of desired protein products, either homologous or heterologous to the cell. 

The present invention further relates to a mutant cell of a parent cell which comprises a
disruption or deletion of a nucleic acid sequence encoding the polypeptide or a control sequence
thereof, which results in the mutant cell producing less of the polypeptide than the parent cell.

The polypeptide-deficient mutant cells so created are particularly useful as host cells for
the expression of homologous and/or heterologous polypeptides. Therefore, the present
invention further relates to methods for producing a homologous or heterologous polypeptide
comprising (a) culturing the mutant cell under conditions suitable for production of the
polypeptide; and (b) recovering the polypeptide. In the present context, the term "heterologous
polypeptides" is defined herein as polypeptides which are not native to the host cell, a native
protein in which modifications have been made to alter the native sequence, or a native protein
whose expression is quantitatively altered as a result of a manipulation of the host cell by
recombinant DNA techniques.

In a still further aspect, the present invention relates to a method for producing a protein 
product essentially free of protease activity by fermentation of a cell which produces both a 
polypeptide encoded by a nucleic acid sequence of the present invention as well as the protein 
product of interest. The method comprises adding an effective amount of an agent capable of 
inhibiting protease activity to the fermentation broth either during or after the fermentation has
been completed, recovering the product of interest from the fermentation broth, and optionally 
subjecting the recovered product to further purification. This method is further illustrated in the 
examples below. 

In a still further alternative aspect, the present invention relates to a method for
producing a protein product essentially free of protease activity, wherein the protein product of
interest is encoded by a DNA sequence present in a cell which also contains a nucleic acid
sequence of the present invention encoding the polypeptide having protease activity. The
method comprises cultivating the cell under conditions permitting the expression of the product,
subjecting the resultant culture broth to a combined pH and temperature treatment so as to
reduce the protease activity substantially, and recovering the product from the culture broth.
Alternatively, the combined pH and temperature treatment may be performed on an enzyme
preparation recovered from the culture broth. The combined pH and temperature treatment may
optionally be used in combination with a treatment with a protease inhibitor.

In accordance with this aspect of the invention, it is possible to remove at least 60%,
preferably at least 75%, more preferably at least 85%, still more preferably at least 95%, and
most preferably at least 99% of the protease activity. It is contemplated that a complete 
removal of protease activity may be obtained by use of this method. 
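
The removal levels recited above can be related to measured activities with simple arithmetic. The following is a minimal sketch; the activity values, units, and function names are hypothetical and serve only to show the calculation.

```python
# Hypothetical illustration: percent of protease activity removed by a treatment,
# compared against the removal levels recited in the text (60, 75, 85, 95, 99%).
LEVELS = [
    (99.0, "most preferred"),
    (95.0, "still more preferred"),
    (85.0, "more preferred"),
    (75.0, "preferred"),
    (60.0, "minimum recited"),
]


def percent_removed(activity_before, activity_after):
    """Percent of the starting activity that is no longer detected after treatment."""
    if activity_before <= 0:
        raise ValueError("starting activity must be positive")
    return 100.0 * (activity_before - activity_after) / activity_before


def recited_level(percent):
    """Return the label of the highest recited level reached by this removal."""
    for threshold, label in LEVELS:
        if percent >= threshold:
            return label
    return "below the recited range"


# Made-up measurements in arbitrary units.
removed = percent_removed(200.0, 10.0)
print(removed, recited_level(removed))
```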

The combined pH and temperature treatment is preferably carried out at a pH in the 
range of 6.5-7 and a temperature in the range of 25-70°C for a sufficient period of time to attain 
the desired effect, typically about 30 to 60 minutes. 

The methods used for cultivation and purification of the product of interest may be 
performed by methods known in the art. 

The methods of the present invention for producing an essentially protease-free product
are of particular interest in the production of prokaryotic polypeptides, in particular Bacillus
proteins such as enzymes. The enzyme may be selected from, e.g., an amylolytic enzyme, a
lipolytic enzyme, a proteolytic enzyme, a cellulolytic enzyme, an oxidoreductase or a plant
cell-wall degrading enzyme. Examples of such enzymes include an aminopeptidase, amylase,
amyloglucosidase, carbohydrase, carboxypeptidase, catalase, cellulase, chitinase, cutinase,
cyclodextrin glycosyltransferase, deoxyribonuclease, esterase, galactosidase, beta-galactosidase,
glucoamylase, glucose oxidase, glucosidase, haloperoxidase, hemicellulase, invertase,
phytase, phenoloxidase, polyphenoloxidase, proteolytic enzyme, ribonuclease, transferase,
transglutaminase, or xylanase. The protease-deficient cells may also be used to express
heterologous proteins of pharmaceutical interest.

It will be understood that the term "prokaryotic polypeptides" includes not only native
polypeptides, but also those polypeptides, e.g., enzymes, which have been modified by amino
acid substitutions, deletions or additions, or other such modifications to enhance activity,
thermostability, pH tolerance and the like.

In a further aspect, the present invention relates to a protein product essentially free 
from protease activity which is produced by a method of the present invention. 

The recombinant polypeptides encoded by the nucleic acid sequences of the present
invention may be used in conventional applications of proteolytic enzymes, particularly at a
high pH, e.g., in laundry and dishwashing detergents, institutional and industrial cleaning, and
leather processing. The recombinant polypeptides are particularly useful in detergents because
of their enhanced stability toward oxidation under alkaline conditions, e.g., bleaching agents of
the peroxy type.

The recombinant polypeptides may also be used in numerous other applications
including debittering or enhancing the degree of hydrolysis of protein hydrolysates, flavor
development through hydrolysis of a protein, degradation of undesirable peptides, and
enzymatic synthesis of peptides. The use of proteases in these and other applications is well
established in the art.

The present invention is further described by the following examples which should not
be construed as limiting the scope of the invention.

Examples

All primers and oligos were synthesized on an Applied Biosystems Model 394
Synthesizer (Applied Biosystems, Inc., Foster City, CA) according to the manufacturer's
instructions.

Example 1: Construction of Bacillus subtilis donor strain BW154

Several genes (spoIIAC, aprE, nprE, amyE, and srfC) were deleted in the Bacillus
subtilis A164 (ATCC 6051A) and A1630 (NCFB 736) host strains described herein. In order to
accomplish this task, plasmids containing deleted versions of these genes were introduced into
these strains using the pLS20-mediated conjugation system (Koehler and Thorne, 1987, supra).
Briefly, this system is comprised of a Bacillus subtilis "donor" strain which contains a large
plasmid designated pLS20. pLS20 encodes the functions necessary for mobilizing pLS20 into
a "recipient" strain of Bacillus subtilis. In addition, it has been shown that plasmids such as
pUB110 and pBC16 are also mobilized by this conjugation system (in the presence of pLS20).
These plasmids contain a cis-acting region (oriT) and a gene (orf-beta) encoding a trans-acting
function that acts at the oriT site and facilitates the mobilization of these plasmids into a
recipient strain. Plasmids containing only oriT can also be mobilized if the donor strain
contains both pLS20 and either pUB110 or pBC16 (in this case, orf-beta function is provided in
trans).

The pLS20 plasmid or a derivative such as pXO503 (Koehler and Thorne, 1987, supra)
must be present in order for a strain to be a proficient donor. In addition, it is also desirable to
have a means of counter-selecting against the donor strain after the conjugation has been
completed. A counter-selection scheme has been developed that is very "clean" (no
background) and easy to implement. This involves introducing a deletion in the dal gene of the
donor strain (which encodes the D-alanine racemase enzyme required for cell wall synthesis)
and selecting against the donor strain by growing the cell mixture from a conjugation
experiment on solid media devoid of D-alanine (this amino acid must be added exogenously to
the media in order for a dal- strain of Bacillus subtilis to grow).
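
The logic of this counter-selection can be sketched as a simple growth model. The sketch below is a deliberate simplification: it assumes, for illustration only, that the mobilized plasmid confers erythromycin resistance and that growth depends solely on the two traits shown; the class and function names are hypothetical.

```python
# Hypothetical growth model for D-alanine counter-selection after conjugation.
from dataclasses import dataclass


@dataclass(frozen=True)
class Cell:
    name: str
    dal_deleted: bool    # True: cannot grow without exogenous D-alanine
    erm_resistant: bool  # True: carries a plasmid conferring erythromycin resistance


def grows(cell, d_alanine, erythromycin):
    """Return True if the cell is expected to grow on the given solid medium."""
    if cell.dal_deleted and not d_alanine:
        return False  # dal- cells fail to grow without added D-alanine
    if erythromycin and not cell.erm_resistant:
        return False  # sensitive cells are inhibited by the antibiotic
    return True


mixture = [
    Cell("donor", dal_deleted=True, erm_resistant=True),
    Cell("recipient", dal_deleted=False, erm_resistant=False),
    Cell("transconjugant", dal_deleted=False, erm_resistant=True),
]

# Plate devoid of D-alanine and containing erythromycin.
survivors = [c.name for c in mixture if grows(c, d_alanine=False, erythromycin=True)]
print(survivors)
```

Under these assumptions the donor is removed by the absence of D-alanine and the plasmid-free recipient by the antibiotic, so only transconjugants form colonies.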

In order to delete the genes mentioned above, pE194 replicons (erythromycin
resistance) (Gryczan et al., 1982, Journal of Bacteriology 152: 722-735) containing deleted
versions of the genes and the oriT sequence had to be mobilized into the Bacillus subtilis A164
and A1630 strains. A suitable donor strain should have the following characteristics: 1) a
deletion in the dal gene (for counter-selection) and 2) it must also contain pLS20 (pXO503
would be unsuitable in this case since the pE194 replicons must be maintained by erythromycin
selection and pXO503 already confers resistance to this antibiotic) and either pUB110 or
pBC16 to supply orf-beta function in trans. A description of how Bacillus subtilis BW154 was
constructed as a donor strain follows.

(A) Introduction of a dal deletion in Bacillus subtilis to yield Bacillus subtilis BW96.

First, a strain of Bacillus subtilis with a mutation in the bac-1 gene (this mutation
abolishes the ability of the strain to synthesize the dipeptide antibiotic bacilysin) was chosen,
since wild-type Bacillus subtilis cells actually kill other species of Bacillus during the
conjugation process and this killing potential is greatly reduced in cells which are bac-1.
Therefore, all donor strains have been constructed in a bac-1 background.

The first step in constructing a suitable donor strain was to delete a portion of the dal
gene in the Bacillus subtilis strain 1A758, which is bac-1 (Bacillus Stock Center, Columbus,
OH). A deleted version of the dal gene was constructed in vitro which could be exchanged for

prrr^r^CAGAOATACGTGOOC-^SEQIDNO.) 

Primer 2: 5'-GGATCCACACCAAGTCTGTTCAT-3' (SEQ ID NO:2) (BamHI site underlined)

Primer 4: 5'-AAGCTTATCTCATCCATGGAAA-3' (SEQ ID NO:4) (HindIII site underlined)
The a^auon reactions (.00 p.) — the foi.owing components: 20^ M 
Saaius suMlis ,68 chromosomal DNA, 0, pM of each primer, 200 pM each of dATP^ 

: ^.^Mc^S: 15.-I* Tbereactionswere performed under fiieoHowrng 
ZLJlc for 3 minutes, men 30 cycies each at 95'C for , minute, 50-C for 1 mmu* 
^ ^CfonminutcfofiowedbySminutes at 72-C. Reactions ^ ™« 
* n„ t fc thP V and 3' PCR products were cloned into the pCKll 

instructions. A pCRII clone was identified which contained the 5' half of the dal gene in an
orientation such that the BamHI site introduced by the PCR primer was adjacent to the HindIII
site of the pCRII polylinker (the other orientation would place the sites much farther
apart). The pCRII clone containing the 3' half of the dal gene was then digested with BamHI
and HindIII, and the dal gene fragment was then cloned into the BamHI-HindIII site of the
aforementioned pCRII clone containing the 5- half of the *, gene. ^generated pOU 
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the 5' end (part of the pCRII polylinker) and .JU site at the 3' end of the gene 

In order to introduce this dal deletion into the bacterial chromosome, the deleted gene
was cloned into the temperature-sensitive Bacillus subtilis replicon pE194 (Gryczan et al.,
1982, supra). The deleted dal gene was then introduced into the chromosome in two steps:
first by integrating the plasmid via homologous recombination,
followed by the subsequent removal of the plasmid (again via homologous recombination),
leaving behind the deleted version of the dal gene on the bacterial chromosome.

to™ s, te of the temperature sensitive plasmid pSKVpE.94 (essentially rep.acing me 
pSK vector sequences with the dalΔ fragment). Plasmid pSK+/pE194 was constructed as
ta* Bluescrip. SK* (Stratagene, La Jo „a, CA) and pEI 94 were digested with « 
The pSK vector was then treated with calf intestinal alkaline phosphatase and the two plasmids
were ligated together. The ligation mix was used to transform the E. coli strain DH5α, and
transformants were selected on LB plates containing ampicillin (100 µg/ml) and X-gal (5-
bromo-4-chloro-3-indolyl-β-D-galactopyranoside). Plasmid was purified from several "white"

* g ;r y 8 deC,rOph0reSiS - - jested wim HinaHl audi, 

J/A gene fragment JMH^ The ligation mix was used to transform the L-f smtin 
^ 1A758 (Bacillus Stock Center, Columbus, OH), and tranaformanta were 

ISTTt tem ~ ° f 34 ° C PlaSmid ° NA ^ PUriM *» five eryuLycin 
~ tr^formants and analyzed by restriction enzyme digeation/ge, electiopILs A 

strain harboring this plasmid was subsequently used for the introduction of the dal deletion into
the chromosome via homologous recombination.

sene „ ITT " ^ *" ^S^" ° f ^ * ^ « into the 

gen on the chromosome), me transformed strain was streaked onto a TBAB plate containing 

tompemture of 45°C A targe colony was rested under toe same conditions yielding a 
hom„ pop ^ of ^ con(aining fte tempcrature . sensitive plasmjd J ' 

the dal gene on the chromosome. At the non-permissive temperature, only cells which
contained the plasmid in the chromosome were capable of growing on erythromycin since the
plasmid was incapable of replicating. In order to obtain the second cross-over event (resulting
in excision of the plasmid from the chromosome leaving behind the deleted version of the dal
gene), a loopful of cells was transferred to 20 ml of Luria broth supplemented with D-alanine
(0.1 mg/ml) and grown to late log phase without selection at the permissive temperature of
34°C to permit function of the origin of replication and occurrence of the second cross-over
event. Cells were transferred 4 times more (1/100 dilution each transfer) to allow the plasmid
to excise from the chromosome and segregate out of the population. Finally, cells were plated
for single colonies at 34°C on TBAB plates supplemented with D-alanine (0.1 mg/ml) and
replica-plated onto TBAB plates without D-alanine and TBAB plates with D-alanine
(0.1 mg/ml) and erythromycin (5 µg/ml) to score colonies which were dal- and erythromycin
sensitive. Two out of 50 colonies yielded this phenotype. The resulting strain was designated
Bacillus subtilis BW96, a bac-1, dal- strain.
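
The serial transfer scheme can be put into rough numbers. The following sketch assumes, purely for illustration, that each 1/100 dilution regrows to about the same cell density, so that each transfer corresponds to log2(100) doublings; the function name is hypothetical, and the observed frequency is simply the 2 out of 50 colonies reported above.

```python
# Hypothetical estimate of doublings during non-selective serial transfers.
import math


def doublings_per_transfer(dilution_factor):
    """Doublings needed to regrow to the starting density after a dilution."""
    return math.log2(dilution_factor)


transfers = 4           # "transferred 4 times more"
dilution_factor = 100   # 1/100 dilution at each transfer

total_doublings = transfers * doublings_per_transfer(dilution_factor)
observed_frequency = 2 / 50  # colonies with the desired phenotype

print(round(total_doublings, 1))
print(observed_frequency)
```

Under that assumption the four additional transfers allow on the order of 26 to 27 generations of growth without selection.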

(B) Introduction of pLS20 and pBC16 into the bac-1, dal-deleted Bacillus subtilis strain to
yield the conjugation proficient donor strain Bacillus subtilis BW154.

A donor strain was chosen for introducing plasmids pLS20 and pBC16 into Bacillus
subtilis BW96 wherein the donor strain is an erythromycin sensitive Bacillus subtilis strain (in
order to provide a counter-selection against the donor strain) which contains both pLS20 and
pBC16. A dal-deleted Bacillus subtilis strain containing pLS20 and pBC16 was chosen as a
suitable donor strain which was constructed as follows: Bacillus subtilis DN1686 (U.S. Patent

6736-6740) to make cells erythromycin resistant. The conjugative element pLS20 was

with Bacillus subtilis (natto) 3335 UM8 (Koehler and Thorne, 1987, supra). The
transconjugants were selected as tetracycline and erythromycin resistant colonies possessing a
dal deletion. Colonies carrying pLS20 were scored by their ability to transfer pBC16 to other
Bacillus subtilis strains by conjugation. Finally, the conjugative strain was cured of pHV1248
by raising the temperature to 50°C, yielding the donor strain: Bacillus subtilis DN1686
containing pLS20 and pBC16.

In order to introduce these plasmids into Bacillus subtilis BW96, a suitable counter-
selection scheme had to be implemented, and therefore, Bacillus subtilis BW96 was
transformed with a temperature-sensitive plasmid pSK+/pE194 conferring erythromycin
resistance which could be subsequently removed by growth at a non-permissive temperature.
The pLS20 and pBC16 plasmids were mobilized from Bacillus subtilis DN1686 containing
pLS20 and pBC16 into Bacillus subtilis BW96 (harboring pSK+/pE194) according to the
following procedure. A loopful of each cell type was mixed together on a TBAB plate
supplemented with D-alanine (50 µg/ml) and incubated at 33°C for 5 hours. The cells were
scraped from the plate and transferred to 1 ml of LB medium. The cells were spread at various
dilutions onto TBAB plates supplemented with tetracycline (10 µg/ml), erythromycin (5
µg/ml), and D-alanine (50 µg/ml) and grown at 34°C to select for recipient cells which acquired
pBC16 and in many cases pLS20 as well. To test whether pLS20 was also present in any of the
transconjugants, ten colonies were tested for their ability to transfer pBC16 into Bacillus
subtilis PL1801. Bacillus subtilis PL1801 is Bacillus subtilis 168 (Bacillus Stock Center,
Columbus, OH) with deletions of the genes apr and npr. However, Bacillus subtilis 168 may
also be used. Donors capable of mobilizing pBC16 must contain pLS20 as well. Once a
conjugation proficient strain was identified (Bacillus subtilis bac-1, dal- containing pLS20 plus
pBC16 plus pSK+/pE194), the pSK+/pE194 plasmid was cured from the strain by propagating
the cells in LB medium supplemented with tetracycline (5 µg/ml) and D-alanine (50 µg/ml)
overnight at 45°C, plating for single colonies at 33°C on TBAB plates supplemented with D-
alanine (50 µg/ml), and identifying erythromycin sensitive colonies. This procedure yielded
Bacillus subtilis BW154 which is Bacillus subtilis bac-1, dal- containing pLS20 and pBC16.
A summary of the Bacillus strains and plasmids is presented in Table 1.
Table 1: Bacterial strains and plasmids

Bacillus subtilis strains:
B. subtilis (natto)    pLS20
DN1686                 dal-
DN1280                 dal-
MT101                  DN1280 (pXO503)
1A758                  168 bac-1 (Bacillus Stock Center, Columbus, Ohio)
BW96                   1A758 dalΔ
BW97                   1A758 dalΔ::cat (pXO503)
BW99                   1A758 dalΔ (pPL2541-tet)
BW100                  1A758 dalΔ (pXO503), (pPL2541-tet)
PL1801                 aprΔ, nprΔ

Plasmids:
pBC16                  Mob+, Tc^r
pE194                  temperature sensitive
pLS20                  Tra+
pXO503                 Tra+, MLS^r (=pLS20::Tn917)
pPL2541-tet            Mob+, Tc^r (pE194 ts ori)
pCAsub2                Mob+, Cm^r, Ap^r, (pE194 ts ori)
pSK+/pE194             Em^r, Ap^r, temperature-sensitive
pShv2                  Tra+, Em^r, Cm^r, temperature-sensitive
pHV1248                Em^r, temperature-sensitive

Tra+ implies that the plasmid confers upon any Bacillus subtilis strain bearing it the ability to
conjugate, that is, the plasmid encodes all of the functions for mobilizing a conjugatable
plasmid from the donor to a recipient cell.

Mob+ implies that a plasmid is capable of being mobilized via conjugation by a strain which
contains a Tra+ plasmid (pLS20 or pXO503). The plasmid must contain a cis-acting sequence
and a gene encoding a trans-acting protein (oriT and orf-beta, respectively, in the case of
pBC16) or just an oriT sequence (in the case of pPL2541-tet; here a plasmid supplying orf-beta
functions in trans, such as pBC16, must also be present in the cell).

Example 2: Deletion of the spoIIAC gene of Bacillus subtilis A164 (ATCC 6051A)

A deleted version of the spoIIAC gene, which encodes sigma F permitting cells to
proceed through stage II of sporulation, was created by the splicing by overlap extension (SOE)
technique (Horton et al., 1989, Gene 77: 61-68). Bacillus subtilis A164 (ATCC 6051A)
chromosomal DNA was obtained by the method of Pitcher et al., 1989, supra. Primers 5 and 6
shown below were synthesized for PCR amplification of a region from Bacillus subtilis A164
chromosomal DNA extending from 205 nucleotides upstream of the ATG start codon of the
spoIIAC gene to 209 nucleotides downstream of the ATG start. The underlined nucleotides of
the upstream primer were added to create a HindIII site. The underlined nucleotides of the
downstream primer were complementary to bases 507 to 524 downstream of the ATG
translational start codon. Primers 7 and 8 were synthesized to PCR-amplify a region extending
from 507 to 884 nucleotides downstream of the ATG translational start codon. The underlined
region of primer 7 was exactly complementary to the 3' half of primer 6 used to amplify the
upstream fragment.
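
The coordinates given above can be checked with simple arithmetic. The sketch below assumes that all positions are counted relative to the ATG start codon and that lengths are taken as plain differences between coordinates; the variable names are hypothetical. Under those assumptions the results agree with the 298 internal nucleotides and the 1089-nucleotide undeleted region recited in this example.

```python
# Hypothetical bookkeeping for the two amplified regions, using positions
# relative to the ATG start codon (negative values are upstream).
UPSTREAM_REGION = (-205, 209)    # amplified with primers 5 and 6
DOWNSTREAM_REGION = (507, 884)   # amplified with primers 7 and 8


def span(start, end):
    """Length between two coordinates, taken as a plain difference."""
    return end - start


# Nucleotides lying between the two regions are absent from the spliced product.
deleted_internal = span(UPSTREAM_REGION[1], DOWNSTREAM_REGION[0])

# Region bounded by the outer primers on the undeleted gene.
undeleted_span = span(UPSTREAM_REGION[0], DOWNSTREAM_REGION[1])

print(deleted_internal)
print(undeleted_span)
```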

Primer 5: 5'-AAGCTTAGGCATTACAGATC-3' (SEQ ID NO:5)

Primer 6: 5'-CG^ATTCTC^GTlCAXrrrCCAGCCCGATGCAGCC-3' (SEQ ID NO:6)

Primer 7: 5'-nnrmr.ATCGGGCTGGAAAATGACGGAGATCCG-3' (SEQ ID NO:7)

Primer 8: 5'-GATCACATCTTTCGGTGG-3' (SEQ ID NO:8)

The two sets of primers were used to amplify the upstream and downstream spoIIAC
fragments in separate PCR amplifications. The amplification reactions (25 µl) contained the
following components: 200 ng of Bacillus subtilis A164 chromosomal DNA, 0.5 µM of each
primer, 200 µM each of dATP, dCTP, dGTP, and dTTP, 1 x Taq polymerase buffer, and 0.625
U of Taq DNA polymerase. Bacillus subtilis A164 chromosomal DNA was obtained according
to the procedure of Pitcher et al., 1989, supra. The reactions were performed under the
following conditions: 96°C for 3 minutes, then 30 cycles each at 96°C for 1 minute, 50°C for 1
minute, and 72°C for 1 minute, followed by 3 minutes at 72°C to ensure addition of a terminal
adenine residue to the amplified fragments (Invitrogen, San Diego, CA). Amplification of the
expected products was verified by electrophoresis through a 1.5% agarose gel.
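
The cycling conditions above can be written down as a small data structure to total the programmed hold time. This is only a sketch: the class and variable names are hypothetical, and ramping time between temperatures, which depends on the instrument, is not included.

```python
# Hypothetical representation of the thermal cycling program described above.
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    temp_c: float
    minutes: float


initial_denaturation = Step(96, 3)
cycle = [Step(96, 1), Step(50, 1), Step(72, 1)]  # denature, anneal, extend
cycle_count = 30
final_extension = Step(72, 3)


def total_hold_minutes(initial, cycle_steps, cycles, final):
    """Sum the programmed hold times, ignoring temperature ramping."""
    per_cycle = sum(step.minutes for step in cycle_steps)
    return initial.minutes + cycles * per_cycle + final.minutes


program_minutes = total_hold_minutes(
    initial_denaturation, cycle, cycle_count, final_extension
)
print(program_minutes)
```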

A new PCR mixture containing 2.5 µl of each amplification reaction above was then
performed under the same conditions but containing only primers 5 and 8, producing a
"spliced" fragment of 1089 nucleotides, representing the spoIIAC gene lacking 298 internal
nucleotides. This fragment was cloned into the pCRII vector using the Invitrogen TA Cloning
Kit according to the manufacturer's instructions, excised as a HindIII-EcoRI fragment and then
cloned into HindIII/EcoRI-digested pShv2. pShv2 (Figure 1) is a shuttle vector constructed by
ligating XbaI-cut pBCSK+ (Stratagene, La Jolla, CA) containing oriT of pUB110 with XbaI-cut
pE194, followed by ligation of oriT from pUB110 as a PCR-amplified fragment containing SstI
compatible ends. The oriT fragment permits mobilization of the plasmid into Bacillus subtilis
A164 by pLS20-mediated conjugation (Battisti et al., 1985, Journal of Bacteriology 162: 543-
550). pShv2-ΔspoIIAC was transformed into donor strain Bacillus subtilis BW154 (Example
1). Bacillus subtilis BW154 (pShv2-ΔspoIIAC) was used as a donor strain to introduce the
shuttle vector containing the deleted gene into Bacillus subtilis A164.

Exchange of the deleted gene with the intact chromosomal gene was effected by
conjugation of Bacillus subtilis BW154 transformed with pShv2-ΔspoIIAC with Bacillus
subtilis A164, selection of erythromycin-resistant transconjugants, and growth at 45°C. At this
temperature, the pE194 replicon is inactive, and cells are only able to maintain erythromycin
resistance by Campbell integration of the plasmid containing the deleted gene at the spoIIAC
locus. A second recombination event, resulting in loopout of vector DNA and replacement of
the intact spoIIAC gene with the deleted gene, was effected by growth of the strain for two
rounds in LB medium without antibiotic selection at 34°C, a temperature permissive for
function of the pE194 replicon. Colonies in which gene replacement had occurred were
identified according to the following criteria: 1) absence of erythromycin (erm) resistance
conferred by the shuttle vector pShv2, 2) decreased opacity on sporulation medium, indicating

nucleotide! .instead of 1089 nucleotides representing tire unde.etedvers.on of the gene. 

Example 3: Deletion of the nprE gene of Bacillus subtilis A164 ΔspoIIAC

An upstream portion of the neutral protease (nprE) gene (nucleotides 40-610
downstream of the GTG start codon) was PCR-amplified from Bacillus subtilis A164 ΔspoIIAC
chromosomal DNA prepared in the manner described in Example 2 using primers 9 and 10
shown below. A downstream portion of the nprE gene (nucleotides 1040-1560) was PCR-
amplified using primers 11 and 12 shown below. Primers 10 and 11 were designed such to
[...]

The amplification reactions (25 µl) contained the same components and were performed under
the same conditions specified in Example 2.

Primer 9: 5'-CGTTTATGAGTTTATCAATC-3' (SEQ ID NO:9)

Primer 10: 5'-AGACTTCCCAGTTTGCAGGT-3' (SEQ ID NO:10)

Primer 11: 5'-CAAACTGGGAAGTCTCGACGGTTCATTCTTCTCTC-3' (SEQ ID NO:11)

Primer 12: 5'-TCCAACAGCATTCCAGGCTG-3' (SEQ ID NO:12)
The upstream and downstream fragments were gel [...]

Kit according to the manufacturer's instructions (Qiagen, Chatsworth, CA). A new PCR
mixture (100 µl) containing approximately 20 ng of each purified fragment was prepared.
The SOE reaction was performed under the following conditions: cycles 1-3 [...]
"spliced" fragment and cycles 4-30 in the presence of primers 9 and 12.

[...]

pCRII vector and verified by restriction analysis. The fragment was then cloned into pShv2 as
a BamHI-XhoI fragment. This plasmid, pShv2-ΔnprE, was transformed into Bacillus [...]
to generate a suitable donor strain for conjugation. The plasmid was then mobilized
into Bacillus subtilis A164 ΔspoIIAC. The ΔnprE gene was introduced into the chromosome of
Bacillus subtilis A164 ΔspoIIAC by temperature shift as described in Example 2. An nprE-
phenotype was scored by patching erm-sensitive colonies onto TBAB agar plates supplemented
with [...]

is observed. The 430 base pair deletion was verified by PCR analysis on chromosomal DNA
using primers 9 and 12.

Example 4: Deletion of the aprE gene of Bacillus subtilis A164 ΔspoIIAC ΔnprE

SOE was used to create a deleted version of the Bacillus subtilis aprE gene which
encodes an alkaline subtilisin protease. An upstream portion of aprE was PCR-amplified using
primers 13 and 14 shown below from Bacillus subtilis A164 chromosomal DNA prepared as
described in Example 2 to create a fragment extending from 189 nucleotides upstream of the
[...]

The underlined nucleotides of primer 13 were included to add an EcoRI site. The underlined
nucleotides of primer 14 were added to provide complementarity to the downstream PCR
fragment and to add a SalI site. A downstream portion of the aprE gene was PCR-amplified
using primers 15 and 16 to create a fragment extending from 789 nucleotides to 1306
nucleotides downstream of the aprE translational start codon. Underlined regions of primers 14
and 15 were added to provide complementarity between the upstream and downstream
fragments. The underlined nucleotides of primer 16 were included to add a HindIII site. The
amplification reactions (25 µl) contained the same components and were conducted under the
same conditions as described in Example 2.

Primer 13: 5'-GCGAATTCTACCTAAATAGAGATAAAATC-3' (SEQ ID NO:13)

Primer 14: 5'-OTrr AOrr,rArrT.rr T Trr„r ccTGTGTAGCCTTGA-3' (SEQ ID NO:14)

Primer 15: 5'-TCAAOGCTA ^^^rn.C GTAGGTGCGOTAAAC-3' (SEQ ID NO:15)

Primer 16: 5'-GC^AGGT_TGACAGAGAACAGAGAAGCCAG-3' (SEQ ID NO:16)

The upstream and downstream fragments were purified using the QIAquick PCR Purification
Kit according to the manufacturer's instructions (Qiagen, Chatsworth, CA).

The two purified fragments were then spliced together using primers 13 and 16. The
amplification reaction (50 µl) contained the same components as above except the
chromosomal DNA was replaced with 2 µl each of the upstream and downstream PCR
products. The reactions were incubated for 1 cycle at 96°C for 3 minutes (without the dNTPs
and Taq polymerase), and then for 30 cycles each at 96°C for 1 minute and 72°C for 1 minute.

[...] The reaction product was isolated by agarose electrophoresis, cloned into pCRII, excised
as an EcoRI-HindIII fragment, and then cloned into EcoRI/HindIII-digested pShv2 to yield
pShv2-ΔaprE. This plasmid was introduced into the donor strain described above for conjugal
transfer into Bacillus subtilis A164 ΔspoIIAC ΔnprE.

Replacement of aprE with the deleted gene was effected as described above for spoIIAC
and nprE. Colonies in which aprE had been deleted were selected by erythromycin sensitivity
and reduced clearing zones on agar plates with an overlay containing 1% non-fat dry milk.
Deletion of aprE was confirmed by PCR.

Bacillus subtilis A164 ΔspoIIAC ΔnprE ΔaprE is herein designated Bacillus subtilis
A164Δ3.

Example 5: Deletion of the amyE gene of Bacillus subtilis A164 ΔspoIIAC ΔnprE ΔaprE

SOE was used to create a deleted version of the amyE gene which encodes Bacillus
subtilis alpha-amylase. An upstream portion of amyE was PCR-amplified from Bacillus
subtilis A164 chromosomal DNA using primers 17 and 18 shown below. This created a
fragment extending from 421 nucleotides upstream of the amyE translational start codon to
nucleotide 77 of the amyE coding sequence, adding a SalI site at the upstream end and SfiI and
NotI sites at the downstream end. A downstream portion of amyE was PCR-amplified using
primers 19 and 20 shown below. This created a fragment extending from nucleotide 445 to
nucleotide 953 of the amyE coding sequence, and added SfiI and NotI sites at the upstream end
and a HindIII site at the downstream end. Restriction sites are denoted by underlining. The
amplification reactions (25 µl) contained the same components and were conducted under the
same conditions as described in Example 2.

The two fragments were then spliced together by PCR using primers 17 and 20. The
amplification reaction (25 µl) contained the same components as above except the
chromosomal DNA was replaced with 2 µl each of the upstream and downstream PCR
products. The reactions were incubated for 1 cycle at 96°C for 3 minutes (without the dNTPs
and Taq polymerase), and then for 30 cycles each at 96°C for 1 minute and 72°C for 1 minute.
This reaction fused the two fragments by overlap at the region of complementarity between the
two (the SfiI and NotI sites) and resulted in a fragment of amyE lacking 367 nucleotides from
the coding region and having an SfiI site and a NotI site incorporated between the two portions
of amyE. The reaction product was isolated by electrophoresis using a 1% agarose gel
according to standard methods. This fragment was cloned into pCRII according to the
manufacturer's instructions to yield pCRII-ΔamyE.
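
The size of the amyE deletion follows from the coordinates given above: the upstream
fragment ends at nucleotide 77 of the coding sequence and the downstream fragment begins at
nucleotide 445, so the nucleotides lying strictly between them are lost. A minimal Python
sketch of that arithmetic is given here; the function name and the 1-based, inclusive
coordinate convention are illustrative assumptions and are not part of the described method.

```python
def internal_deletion_size(last_upstream_nt: int, first_downstream_nt: int) -> int:
    """Count coding-sequence nucleotides removed when two retained segments are spliced.

    Assumed convention: 1-based positions in the coding sequence, where
    last_upstream_nt is the last nucleotide kept in the upstream fragment and
    first_downstream_nt is the first nucleotide kept in the downstream fragment.
    """
    if first_downstream_nt <= last_upstream_nt:
        raise ValueError("downstream fragment must start after the upstream fragment ends")
    return first_downstream_nt - last_upstream_nt - 1


# amyE: upstream fragment ends at nucleotide 77, downstream fragment starts at 445
amyE_deletion = internal_deletion_size(77, 445)
print(amyE_deletion)  # 367, the deletion size stated in the text
```

Because 367 is not a multiple of three, a deletion of this size would also be expected to
shift the reading frame of the remaining downstream coding sequence.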

Primer 17: 5'-CGTCGACGCCTTTGCGGTAGTGGTGCTT-3' (SEQ ID NO:17) (SalI
site underlined)

Primer 18: 5'-CGCGGCCGCAGGCCCTTAAGGCCAGAACCAAATGAA-3' (SEQ
ID NO:18) (NotI and SfiI sites underlined)

Primer 19: 5'-TGGCCTTAAGGGCCTGCGGCCGC G A Trrrr a a tp. r (SEQ ID
NO:19) (SfiI and NotI sites underlined)

Primer 20: 5'-GAAGCTTCTTCATCATCATTGGCATACG-3' (SEQ ID NO:20)
(HindIII site underlined)

pShv2.1 was created by digesting pShv2 with NotI, filling in the cohesive ends with
Klenow fragment and dNTPs, and religating the plasmid. This procedure destroyed the NotI
recognition site of pShv2. The deleted amyE fragment was excised from pCRII-ΔamyE as a
SalI-HindIII fragment and cloned into SalI/HindIII-digested pShv2.1 to yield pShv2.1-ΔamyE.
This plasmid was introduced into Bacillus subtilis BW154 for conjugal transfer into Bacillus
subtilis A164 ΔspoIIAC ΔnprE ΔaprE.

Replacement of amyE with the deleted gene was effected as described above for 
spoIIAC, nprE, and aprE. Colonies in which gene replacement had occurred were selected by 
erythromycin sensitivity and the inability to produce a zone of clearing on starch azure overlay 
plates. Deletion of amyE was confirmed by PCR amplification of the deleted gene from
chromosomal DNA using primers 17 and 20. 

Example 6: Deletion of the srfC gene of Bacillus subtilis A164 ΔspoIIAC ΔnprE ΔaprE ΔamyE
to produce Bacillus subtilis A164 ΔspoIIAC ΔnprE ΔaprE ΔamyE ΔsrfC

Primers 21-24 shown below were synthesized for the creation of a deletion in srfC of
the surfactin operon. Primer 21 overlaps an existing HindIII site (underlined) in the srfC gene
and in conjunction with primer 22 permits PCR amplification of a region extending from 410
nucleotides to 848 nucleotides downstream of the translational start of srfC. The underlined
portion of primer 22 is complementary to nucleotides 1709-1725 downstream of the ATG start
codon. Primers 23 and 24 permit PCR amplification of a region of 1709 to 2212 nucleotides
downstream of the translational start of srfC. The underlined portion of primer 23 is
complementary to nucleotides 835-848 downstream of the ATG codon. The amplification
reactions (25 µl) contained the same components and were performed under the same
conditions as described in Example 2.

Primer 21: 5'-AAGCTTTGAATGGGTGTGG-3' (SEQ ID NO:21)

Primer 22: 5'-CCGCTTGTTCTTTCATCCCCTGAAACAACTGTACCG-3' (SEQ ID
NO:22)

Primer 23: 5'-CAGTTGTTTCAGGGGATGAAAGAACAAGCGGCTG-3' (SEQ ID
NO:23)

Primer 24: 5'-CTGACATGAOGCACTGAC-3' (SEQ ID NO:24)

[...] and other contaminants were removed from the PCR product using a Qiagen
PCR spin column (Qiagen, Chatsworth, CA). The complementarity between the two PCR
fragments permitted splicing by SOE. The PCR product (2 µl or approximately 50 [...]

nucleotide fragment that lacked an internal 859 nucleotides of the srfC gene. The deleted
[...]

to the surfactin molecule, and furthermore results in a frameshift mutation which results in
termination of the peptide prior to the thioesterase active site-like region presumed to be
involved in surfactin release from the SrfC protein (Cosmina et al., 1993, supra).

Replacement of srfC with the deleted gene was effected as described above for spoIIAC,
nprE, aprE, and amyE. Colonies in which gene replacement had occurred were selected by
erythromycin sensitivity, the inability to produce a zone of clearing on blood agar plates
([...], 1993, [...] 175: 6203-6211), and lack of foaming upon
cultivation for 4 days at 37°C and 250 rpm in 250 ml shake flasks containing 50 ml of PS-1
medium composed of 10% sucrose, 4% soybean flour, 0.42% anhydrous [...]

Deletion of srfC was confirmed by PCR amplification of the deleted gene from chromosomal
DNA using primers [...]. Bacillus subtilis A164 ΔspoIIAC ΔnprE ΔaprE ΔamyE ΔsrfC is herein
designated Bacillus subtilis A164Δ5.

Example 7: Construction of Bacillus subtilis A1630 ΔspoIIAC ΔnprE ΔaprE ΔamyE

Bacillus subtilis A1630 ΔspoIIAC ΔnprE ΔaprE ΔamyE ΔsrfC was constructed from
Bacillus subtilis A1630 (NCFB 736, formerly NCDO 736) according to the same procedures
described in Examples 1-6 for Bacillus subtilis A164 ΔspoIIAC ΔnprE ΔaprE ΔamyE ΔsrfC
(Bacillus subtilis A164Δ5), using the deletion plasmids constructed for the Bacillus subtilis
A164 [...]

Bacillus subtilis A1630 ΔspoIIAC ΔnprE ΔaprE ΔamyE ΔsrfC is herein designated
Bacillus subtilis A1630Δ5.

Example 8: Preparation of chromosomal DNA of Bacillus JP170

Bacillus JP170 (NCIB 12513) was grown overnight at 37°C in 50 ml of Luria-Bertani
(LB) broth containing 0.1 M NaHCO3 pH 8. Genomic DNA was prepared according to the
method of Pitcher et al., 1989, supra.

Example 9: Preparation of probes of the Bacillus JP170 protease gene 

Based on the N-terminal and internal amino acid sequences of the Bacillus JP170 
protease (JP 4197182) shown below, primers were synthesized to clone the Bacillus JP170 
protease gene: 

N-terminus: NDVARGIVKADVAQNNFGLYGQGQIVADTGLDTGRNDS (SEQ ID NO:25)
Internal peptide: GAADVGLGFPNGNQGWGRVTLDK (SEQ ID NO:26) 

The primers designated 170-291, 1701, and 1702B shown below (where I=inosine) 
were used in the amplification reactions described below. 
170-291: 5'-CCCCAICCITGITTICCITTIGGIAAICC-3' (SEQ ID NO:27)
1701: 5'-GGIATIGTIAAIGCIGAIGTIGCICAIAAIAAITTIGG-3' (SEQ ID NO:28)
1702B: 5'-TAIGGICAIGGICAIATIGTIGCIGTIGCIGAIACIGG-3' (SEQ ID NO:29) 
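
The relationship between an inosine-containing primer and the peptide it was designed from
can be checked codon by codon. The Python sketch below is illustrative only: it assumes the
standard genetic code, treats each inosine (I) as able to stand in for any of the four bases
(a simplification of inosine pairing), and reads the garbled "CAJ" in primer 1701 as "CAI".
The function names are hypothetical and are not taken from the source.

```python
BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def residues_for_codon(codon: str) -> set:
    """Amino acids a codon can encode when every inosine (I) is treated as any base."""
    expansions = [""]
    for base in codon:
        choices = BASES if base == "I" else base
        expansions = [prefix + c for prefix in expansions for c in choices]
    return {CODON_TABLE[c] for c in expansions}


def primer_matches_peptide(primer: str, peptide: str) -> bool:
    """True if each complete codon of the primer can encode the aligned residue.

    A trailing partial codon (fewer than three bases) is ignored.
    """
    usable = len(primer) - len(primer) % 3
    codons = [primer[i:i + 3] for i in range(0, usable, 3)]
    if len(codons) > len(peptide):
        return False
    return all(aa in residues_for_codon(c) for c, aa in zip(codons, peptide))


primer_1701 = "GGIATIGTIAAIGCIGAIGTIGCICAIAAIAAITTIGG"
n_terminus = "NDVARGIVKADVAQNNFGLYGQGQIVADTGLDTGRNDS"
start = n_terminus.find("GIVKADVAQNNF")  # region of the N-terminus covered by primer 1701
print(primer_matches_peptide(primer_1701, n_terminus[start:]))
```

Under these assumptions the twelve complete codons of primer 1701 are each compatible with
residues GIVKADVAQNNF of the N-terminal sequence, and the two trailing bases (GG) are
consistent with the glycine that follows.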

Amplification reactions were prepared with 50 pmol of either primers 1701 and 170-291
or 1702B and 170-291, 7 µg of Bacillus JP170 chromosomal DNA as template, 1X PCR
buffer (Perkin-Elmer, Foster City, CA), 100 µM each of dATP, dCTP, dGTP, and dTTP and
0.5 U of AmpliTaq Gold (Perkin-Elmer, Foster City, CA). Reactions were incubated in a
Stratagene Robocycler 40 (Stratagene, La Jolla, CA) programmed for 1 cycle at 96°C for 3
minutes and 30 cycles each at 40°C for 1 minute, 40°C for 1 minute, and 72°C for 1 minute.

Amplification with primers 170-291 and 1701 resulted in a 905 bp product designated
1/291, and with primers 1702B and 170-291 an 863 bp product designated 2B/291. Both PCR
products were individually cloned into the Invitrogen TA Cloning Kit vector pCR2.1
(Invitrogen, San Diego, CA) according to the manufacturer's instructions. Sequencing with an
Applied Biosystems Model 377 Sequencer (Applied Biosystems, Foster City, CA) showed that
these PCR products had 90% identity to the amino acid sequence of the Ya protease disclosed
in JP 4197182 based on alignment of the deduced amino acid sequences in the GeneAssist
1.1M database (Applied Biosystems, Foster City, CA). The amino acid sequence of the PCR
product also had a 35% identity to the amino acid sequence of the Bacillus serine protease
subtilisin.

Primers 170-291, 1701, and 1702B were then used to PCR-amplify DIG-labeled probes
of 1/291 and 2B/291 using the Genius System PCR DIG Probe Synthesis Kit (Boehringer
Mannheim Corporation, Indianapolis, IN) according to the manufacturer's instructions under
the same PCR conditions.

Example 10: Screening of chromosomal libraries

Probe 2B/291 described in Example 9 was used to screen a chromosomal library of
Bacillus JP170. The library was constructed by ligating Sau3A partially-digested (4-8 kb)
Bacillus JP170 chromosomal DNA into the BamHI sites of the vector pSJ1678 (Figure 2).
Escherichia coli DH5α (Gibco BRL, Gaithersburg, MD) was transformed with the
chromosomal library and screened by colony lifts using the [...]
following the Genius System instructions. After screening approximately 4600 colonies, 1
hybridized to the probe and was designated Clone 1. Plasmid DNA from Clone 1 was
prepared using a QIAprep 8 Plasmid Kit (Qiagen, Chatsworth, CA). Restriction digests of
plasmid DNA indicated that Clone 1 contained an insert of approximately 13 kb.

DNA from Clone 1 and Bacillus JP170 chromosomal DNA were analyzed by Southern
hybridization using 2B/291 as a probe. Specifically, 7 µg of Bacillus JP170 chromosomal
DNA and 1 6 ng of Clone 1 plasmid DNA were digested with EcoRI and HindIII and the digests
were electrophoresed on a 1% agarose gel. The DNA was capillary transferred onto a Nytran
Plus membrane (Schleicher and Schuell, Keene, NH) following the manufacturer's instructions.
The membrane was then probed following the Genius System instructions.

The Southern hybridization results demonstrated that the 2B/291 probe hybridized with
2 bands of 1800 and 1400 bp from the EcoRI digested chromosomal DNA and with 2 bands of
approximately 2000 and 1800 bp from the EcoRI digested Clone 1 DNA. The 2B/291 probe
also hybridized with 2 bands of 2000 and 1800 bp from the HindIII digested chromosomal
DNA and with 1 band of approximately 2000 bp from the HindIII digested Clone 1 DNA.
These results indicated that Clone 1 did not contain the entire gene since only the single 2000
bp band hybridized with the 2B/291 probe. Sequencing of the HindIII fragment from Clone 1
suggested it contained a partial open reading frame which contained 1200 bp of the 5' end of
the protease gene, based on homology to the protease disclosed in JP 4197182.

Since the Southern hybridization results indicated that the 3' end was located on an
1800 bp HindIII fragment, a new library was constructed. Bacillus JP170 chromosomal DNA
was digested with HindIII and the digest electrophoresed on a 1% agarose gel. Fragments
ranging in size from 1500 bp to 2200 bp were excised and purified using a QIAquick Gel
Extraction Kit (Qiagen, Chatsworth, CA). These fragments were then ligated into the HindIII
site of pUC118. E. coli DH5α (Gibco BRL, Gaithersburg, MD) was transformed with the
ligation following the manufacturer's instructions and transformants were screened using the
2B/291 probe as described above. After screening 3200 transformants, 5 positive transformants
were identified. Plasmid DNA from each of the 5 transformants was prepared using a QIAprep
8 Plasmid Kit according to the manufacturer's instructions and digested with HindIII. The
resulting restriction fragments were compared to Clone 1 plasmid DNA restriction fragments
by gel electrophoresis. All 5 clones contained fragments identical in size to the previously
cloned 5' end of the Bacillus JP170 protease gene.

Example 11: Isolation of the 3' end of the Bacillus JP170 protease gene by inverse PCR 

Inverse PCR was used to isolate the 3' end of the Bacillus JP170 protease gene by
amplifying the region downstream of the chromosomal clone isolated in the library screen
(Clone 1) described in Example 10. Southern hybridization of chromosomal DNA showed that
the 3' end of the gene should be contained on an 1800 bp EcoRI fragment (Example 10). Size-
selected chromosomal DNA was prepared by digestion of the Bacillus JP170 chromosomal
DNA with EcoRI followed by electrophoresis on a 1% agarose gel. Fragments ranging from
approximately 1600 bp to 2000 bp were isolated using a QIAquick Gel Extraction Kit and
eluted in 30 µl of TE. The EcoRI fragments were self-ligated in a 10 µl ligation reaction
containing the following components: 1 µl of size-selected DNA, 1X ligation buffer (Boehringer
Mannheim, Indianapolis, IN), and 1 unit of T4 DNA Ligase (Boehringer Mannheim,
Indianapolis, IN). The ligation was incubated overnight at 14°C. A 3 µl volume of the ligation
mix was then digested with HindIII in a 20 µl reaction to linearize the self-ligated EcoRI
fragments between the binding sites of the PCR primers. This linearized DNA was then used as
a template in a PCR reaction with 2 divergent primers, 17011 and 17012, whose sequences
shown below were based on the sequence of the protease gene contained on Clone 1.
17011: 5'-GTAGGTTTTCGGTTGCCCCAACTGTAATCGC-3' (SEQ ID NO:30)
17012: 5'-GGTCCTACTAGAGATGGACGTATTAAGCCGG-3' (SEQ ID NO:31)

The amplification was performed using the GeneAmp Kit (Perkin-Elmer, Foster City,
CA) following the manufacturer's instructions.
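
The self-ligation and relinearization steps described above can be pictured as a rotation of
the fragment: joining the ends makes a circle, and cutting once inside the already-sequenced
region reopens it with the unknown flank in the middle. The Python sketch below uses a toy
sequence to show this rearrangement; the function name, the toy sequences, and the cut
position are illustrative assumptions and do not represent the actual JP170 sequence.

```python
def inverse_pcr_template(fragment: str, cut_site: int) -> str:
    """Model self-ligation of a linear fragment followed by one cut inside the known region.

    Joining the fragment's ends makes a circle; cutting that circle at cut_site
    (an index into the original fragment) reopens it so that the sequence that was
    downstream of the cut now precedes the sequence that was upstream of it.
    """
    if not 0 < cut_site < len(fragment):
        raise ValueError("cut_site must fall inside the fragment")
    return fragment[cut_site:] + fragment[:cut_site]


# Toy fragment: upper case = already-sequenced region, lower case = unknown flank.
known = "AAACCCGGGTTT"
unknown = "acgtacgtacgt"
fragment = known + unknown

# Cut in the middle of the known region, between the two divergent primer sites.
template = inverse_pcr_template(fragment, cut_site=6)
print(template)  # GGGTTTacgtacgtacgtAAACCC
```

In the rearranged template the two halves of the known region flank the unknown sequence,
so primers that pointed away from each other on the chromosome now point toward each other
across the unknown flank.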

The amplification resulted in a 1700 bp PCR product. The 1700 bp product was cloned
into pCR2.1 from the TA Cloning Kit and sequenced as previously described. Comparison of
the deduced amino acid sequence with the known amino acid sequence of the protease
disclosed in JP 4197182 indicated that the cloned inverse PCR product contained the 3' end of
the Bacillus JP170 protease gene.

Example 12: Reconstruction of the Bacillus JP170 protease gene

The 5' and 3' ends of the Bacillus JP170 protease gene were cloned into the multicopy
Bacillus vector pSJ2882-MCS (Figure 3) to reconstruct the Bacillus JP170 protease gene.
pSJ2882-MCS is derived from pHP13 (Haima et al., 1987, Molecular and General Genetics 209:
335-342) but contains a SfiI-flanked MCS, and also a SstI 0.5 kb fragment containing the
oriT region from pUB110. This latter fragment permits mobilization of the plasmid into
Bacillus subtilis A164 by pLS20-mediated conjugation (Battisti et al., 1985, Journal of
Bacteriology 162: 543-550).

PCR-amplification from Bacillus JP170 chromosomal DNA with primers adding new
restriction sites allowed cloning of the 5' and 3' fragments separately into the plasmid. The
following primers were used for the addition of a SmaI site into the 5' Bacillus JP170
protease gene fragment:

170Sma: 5'-C^^CCCGGGGATGTGTTATAAATTGAGAGGAG-3' (SEQ ID NO:32)
17030R: 5'-CCTCGTGAAGAGAATTGAGCAACATGG-3' (SEQ ID NO:33)

The following primers were used for the addition of a 3' NotI site into the 3' Bacillus
JP170 protease gene fragment:

17097F: 5'-GCGATTACAGTTGGGGCAACC-3' (SEQ ID NO:34)
17035NOT: ^^C^GCCGCGTACTCTCATCAATTTCCCAAGC-3' (SEQ ID NO:35)
17036Not: 5'-GCGGCCGCGTCATAAACGTTGCAATCGTGCTC-3' (SEQ ID NO:36)

The amplification reactions were performed under the same conditions as described in
Example [...]

[...] ATG (including the RBS) and extended past the internal HindIII site. This fragment was
cloned as a SmaI-HindIII fragment into the SmaI-HindIII site of pSJ2882-MCS. The 3' end was
amplified from the HindIII site to 192 bp downstream of the stop codon, adding a NotI site, and
was cloned as a HindIII-NotI fragment downstream of the 5' end.

The amyQ promoter (the promoter of a gene encoding a Bacillus amyloliquefaciens amylase
called BAN™, Novo Nordisk A/S, Bagsvaerd, Denmark) was PCR-amplified using primers 37
and 38 listed below according to the amplification conditions described in Example 9:

^Lr^AAC^CTGCAAICQA^GTTTGAGAAAAGAAG-3' ([...] and ClaI
sites underlined, respectively) (SEQ ID NO:37)

nAGCTCCATmCTTATACAAATTATATTTTACATATCAG-3' (SstI site
underlined) (SEQ ID NO:38)

The amyL promoter (the promoter of a gene encoding a Bacillus licheniformis
amylase called TERMAMYL™, Novo Nordisk A/S, Bagsvaerd, Denmark) was PCR-amplified
as described in Example 9 from pPL1759 (Figure 4), a pUB110-based plasmid containing the
amyL promoter. Primer termlSFi was used in the amplification to add an SfiI site to the 5' end
and primer 2iSfi was used to add a SacI site to the 3' end:

termlSFi: 5'-CCAGGCCTTAAGGGCCOCATOCGTCCTTCTTTG-3' (SEQ ID NO:39)

Primer 2iSfi: 5'-CCAGAGCTCCTTTCAATGTAACATATGA-3' (SEQ ID NO:40)

The amyQ promoter (BAN™ promoter) and amyL promoter (TERMAMYL™
promoter) were then inserted upstream of the reconstructed gene into the [...] sites as SfiI-
Ecl136II (blunt) fragments to produce p170BAN and p170TERM, respectively.

Example 13: Sequence analysis of the Bacillus JP170 protease gene

The reconstructed Bacillus JP170 protease gene was sequenced using an Applied
Biosystems Model 377 Sequencer according to the manufacturer's instructions.
[...] of the reconstructed protease gene revealed an open reading frame [...]
shown in Figure 5 (SEQ ID NO:41). The deduced amino acid sequence
(SEQ ID NO:42) as shown in Figure 5 consists of 641 amino acids including a 33 amino acid
[...] prepro region. The [...], including the signal
sequence and prepro region, has 77% identity to the protease disclosed in JP 4197182, and the
deduced mature protein has 89% identity to the same protease (Figure 6, SEQ ID NO:43) as
determined by GeneAssist software (PE Applied Biosystems, Inc., Foster City, CA) and
LaserGene software (DNASTAR, Inc., Madison, WI). Notably, it also contains the C-terminal
extension seen in the protease disclosed in JP 4197182. The best homology in the protein
database was to [...] precursor where the homology was only 35% identity (Figure 6, SEQ
ID NO:44) as determined by GeneAssist.

Example 14: Transformation of Bacillus subtilis with p170BAN and p170TERM

Plasmids p170BAN and p170TERM were transformed into competent cells of Bacillus
subtilis strain A164Δ5 according to the method of Petit et al., 1990, supra, and selected for
chloramphenicol resistance.

Transformants were patched onto TBAB plates containing 5 µg of chloramphenicol per
ml and 1% milk and incubated at 37°C overnight to test for protease production. Strains
containing either p170BAN or p170TERM [...]

the vector only, which made no halos.

Plasmid p170BAN was also transformed into competent cells of Bacillus subtilis strain
168 aprE- nprE- amyE- spoIIE [...] as described above. One transformant designated
Bacillus subtilis LC20 produced zones on 1% milk-TBAB plates.

Example 15: Integration of pLC20 and pLC21 into Bacillus subtilis

To construct the integration vector pCAsub2, the neomycin resistance gene of pPL2419
(Figure 7) was excised by digestion with BclI and BglII and replaced with the chloramphenicol
acetyltransferase (cat) gene-containing BamHI fragment from pMI1101 (Youngman et al.,
1984, Plasmid 12: 1-9) to create plasmid pPL2419-cat. (BamHI sticky ends are compatible
with BclI and BglII sticky ends.) Then, the multiple cloning site (MCS) of pPL2419-cat was
replaced with a new MCS containing SfiI and NotI sites created by annealing the two
oligonucleotides shown below (SEQ ID NO:45 and SEQ ID NO:46):

5'-AGCTTGGCCTTAAGGGCCCGATATCGGATCCGCGGCCGCTGCAGGTAC-3'
(HindIII and KpnI compatible sites are underlined, SfiI and NotI sites are double-underlined)
(SEQ ID NO:45)

5'-CTGCAGCGGCCGCGGATCCGATATCGGGCCCTTAAGGCCA-3' (SEQ ID NO:46)

The annealed oligonucleotides were ligated to HindIII and KpnI-cut pPL2419-cat to generate
p2419MCS5-cat. Then, nucleotides 942 to 1751 of amyE (GenBank Locus BSAMYL,
accession numbers V00101, J01547) were PCR-amplified using primers containing NotI and
KpnI (Asp718I) linkers (SEQ ID NO:47 and SEQ ID NO:48) and Bacillus subtilis strain
A164Δ5 chromosomal DNA as template, and inserted into NotI and Asp718I-digested
p2419MCS5-cat, generating integration vector pCAsub2 (Figure 8), CAsub referring to
chloramphenicol resistance, amylase homology, for use in a subtilis host.

5'-GCGGCCGCGATTTCCAATGAG-3' (nucleotides added to create NotI site are underlined)
(SEQ ID NO:47)

5'-GGTACCTGCATTTGCCAGCAC-3' (nucleotides added to create Asp718I site are
underlined) (SEQ ID NO:48)

Integration of this vector alone into Bacillus subtilis 168 and plating on starch azure overlay 
plates showed complete elimination of amylase activity. 

The amyQ promoter and amyL promoter Bacillus JP170 protease gene cassettes were
isolated from the pSJ2882-MCS-based plasmids p170BAN and p170TERM and cloned into the
SfiI-NotI sites of the Bacillus integration vector pCAsub2 to produce pLC20 and pLC21,
respectively. pCAsub2 is unable to replicate independently in Bacillus and therefore must
integrate into the chromosome to be stably maintained. It contains a truncated version of the
amyE gene which serves as a source of homology, and integration by a single crossover results
in insertion of the entire plasmid at the amyE locus.

pLC20 (amyQ promoter) and pLC21 (amyL promoter) were transformed into competent
cells of Bacillus subtilis strains A164Δ5 and A1630Δ5 according to the method of Petit et al.,
1990, supra. The integrants were designated Bacillus subtilis A164Δ5-B-JP170, Bacillus
subtilis A164Δ5-T-JP170, Bacillus subtilis A1630Δ5-B-JP170, and Bacillus subtilis
A1630Δ5-T-JP170 where B is the BAN™ promoter, T is the TERMAMYL™ promoter, and
JP170 is the protease gene. Chloramphenicol-resistant transformants of each were tested for
protease production on 1% milk-TBAB plates.

All transformants tested made halos that were larger and more distinct than the
multicopy pSJ2882-MCS-based transformants. The presence of the Bacillus JP170 protease
gene and integration at the amyE locus were verified by PCR as described in Example 16.

Example 16: Integration screening 

Putative integrants described in Example 15 were screened by PCR to verify the
presence of the protease gene and to verify integration into the amyE locus. Genomic DNA
from the putative integrants was prepared by resuspending a single colony in 100 µl of H2O,
freezing in dry ice for 5 minutes, followed by boiling for 5 minutes, then repeating the cycle 3
times. Suspensions were centrifuged for 10 minutes. PCR reactions using 5 µl of supernatant
were set up as described in Example 9 using the following protease primers:
17020: 5'-GCTGCACTATTGTCTTCTG-3' (SEQ ID NO:49)
17025: 5'-CAGCAACTGCTACAATCTG-3' (SEQ ID NO:50)

The following primers were used for screening integration: 
17037: 5'-GTGCAGGCTTACAATGTACCAG-3' (SEQ ID NO:51)
LCamyREV: 5'-GCATTTACCTGGCTCCAATGATTC-3' (SEQ ID NO:52)

If the protease gene was present in the strain, then amplification with the protease primers
would result in a 665 bp band. If the protease gene was integrated at the amyE locus, then
amplification would result in a 1555 bp band using the integration primers.
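
The two-band logic above can be written as a small decision rule. The Python sketch below is
only an illustration of how gel results might be interpreted: the 5% sizing tolerance, the
function names, and the wording of the returned labels are assumptions and are not taken
from the source.

```python
EXPECTED_PROTEASE_BP = 665      # expected product of protease primers 17020/17025
EXPECTED_INTEGRATION_BP = 1555  # expected product of integration primers 17037/LCamyREV


def band_present(observed_bp, expected_bp, tolerance=0.05):
    """True if any observed band lies within a fractional tolerance of the expected size."""
    return any(abs(size - expected_bp) <= tolerance * expected_bp for size in observed_bp)


def classify_integrant(protease_lane, integration_lane):
    """Interpret band sizes (in bp) read from the two PCR lanes of one putative integrant."""
    has_gene = band_present(protease_lane, EXPECTED_PROTEASE_BP)
    at_amyE = band_present(integration_lane, EXPECTED_INTEGRATION_BP)
    if has_gene and at_amyE:
        return "protease gene integrated at amyE"
    if has_gene:
        return "protease gene present, integration at amyE not confirmed"
    if at_amyE:
        return "inconsistent result, repeat PCR"
    return "protease gene not detected"


# Example: bands estimated from a gel at roughly 670 bp and 1550 bp
print(classify_integrant([670], [1550]))
```

A tolerance is used because band sizes read against a ladder on an agarose gel are
estimates rather than exact lengths.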

Agarose gel electrophoresis of the resulting PCR products yielded a 1555 bp band
confirming the integration of the Bacillus JP170 protease gene into the chromosome.

Example 17: Amplification of the Bacillus JP170 protease gene expression cassettes 

The amyQ promoter (BAN™ promoter) and amyL promoter (TERMAMYL™
promoter) Bacillus JP170 protease gene cassettes were amplified in the integrated strains
Bacillus subtilis A164Δ5-B-JP170, Bacillus subtilis A164Δ5-T-JP170, Bacillus subtilis
A1630Δ5-B-JP170, and Bacillus subtilis A1630Δ5-T-JP170. This was achieved by
plating on TBAB plates containing successively higher chloramphenicol concentrations of 15,
30, 45, 60, and 80 µg per ml.

The stability of the protease integration after amplification was confirmed by patching
on TBAB plates containing 1% milk at each chloramphenicol concentration. Production of
halos showed 100% stability. After a few hours, amplified strains produced halos comparable
in size to halos produced overnight by unamplified strains.

Example 18: Copy number determination

Southern blots were performed to estimate the copy number of the Bacillus JP170
protease gene expression cassettes in the amplified versus the unamplified versions of Bacillus
subtilis A164Δ5-T-JP170 and Bacillus subtilis A1630Δ5-B-JP170 strains. Genomic DNA
prepared from the strains according to the Bacterial DNA Isolation Protocol described in the
Qiagen Genomic DNA Handbook (Qiagen, Chatsworth, CA) was cut with [...], run on a 0.8%
agarose gel, blotted using a PosiBlot Pressure Blotter and Pressure Control Station
(Stratagene, La Jolla, CA), and hybridized and detected using probe 1/291 (Example 9) and
the DIG System Hybridization and [...] (Boehringer Mannheim, Indianapolis, IN) according
to the manufacturers' instructions. Using the Storm Imaging System Model 860 (Molecular
Dynamics, Sunnyvale, CA) according to the manufacturer's instructions, it was estimated
[...] in each strain.

The Southern blot of the amplified Bacillus subtilis A164Δ5-T-JP170 showed a 300 bp
deletion in the amyL promoter (TERMAMYL™ promoter) Bacillus JP170 protease gene
cassette. However, SDS-PAGE analysis using a Novex 14% Tris-Glycine Precast Gel, 1.0 mm
X 15 well, and Novex DryEase Mini Gel Drying System (Novel Experimental Technology, San
Diego, CA) according to the manufacturer's instructions showed that the expression of the
Bacillus JP170 protease gene was not affected by this deletion.

Using a series of PCR reactions, it was established that the deletion is 5' of the Bacillus
JP170 protease gene and encompasses the amyL promoter. The PCR reactions were performed
using several primers described supra and the following primers:
17021: 5'-CCAATAGTAGAAGGACTG-3' (SEQ ID NO:53)
RB1701: 5'-CTTCAGATTGGAAAGCGAGCGGACGGAATCATTGATC-3' (SEQ ID NO:54)
RB1702: 5'-CTCAGCTTGAAGAAGTGA-3' (SEQ ID NO:55)
RB1703: 5'-GAAGCAGAGAGGCTATTG-3' (SEQ ID NO:56)
RB1704: 5'-GAAAATATAGGGAAAATGT-3' (SEQ ID NO:57)

The PCR reactions were performed using the following primer pairs: 17037/17036Not,
TermlSfi/RB1701, RB1702/17021, RB1703/17021, RB1704/17021, 17036Not/TermlSfi,
[...] M13-48Rev./17021. Template DNA (5 µl of 40 µg/ml), [...]
(Perkin-Elmer, Foster City, CA), [...] µl of 10 mM

MgCl2, 5 µl of 1 mM dNTP mix, 2.5 µl of 5 pmol/µl of each primer pair, 0.125 µl of 5 U/µl
AmpliTaq Gold polymerase (Perkin-Elmer, Foster City, CA), and 6.375 µl of deionized water
were used in each PCR reaction. Reactions were incubated in a Stratagene Robocycler 40
programmed for 1 cycle at 96°C for 10 minutes, 30 cycles each at 96°C for 1 minute, 55°C for
1 minute, and 72°C for 1 minute, and 1 cycle at 72°C for 5 minutes.
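
The cycling program described above can be written down as data, which makes it easy to check the total programmed run time. The following is a minimal Python sketch and is not part of the original protocol; the names `Step`, `PROGRAM`, and `total_hold_minutes` are hypothetical, and ramp times between temperatures are ignored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    temp_c: int     # block temperature in degrees Celsius
    minutes: float  # programmed hold time in minutes


# Each stage is (number of cycles, steps within one cycle), as stated in the text:
# 1 x (96 C, 10 min); 30 x (96 C, 55 C, 72 C, 1 min each); 1 x (72 C, 5 min).
PROGRAM = [
    (1, [Step(96, 10)]),
    (30, [Step(96, 1), Step(55, 1), Step(72, 1)]),
    (1, [Step(72, 5)]),
]


def total_hold_minutes(program):
    """Sum the programmed hold times; ramping between temperatures is ignored."""
    return sum(cycles * sum(step.minutes for step in steps)
               for cycles, steps in program)


if __name__ == "__main__":
    print(total_hold_minutes(PROGRAM))  # 10 + 30 * 3 + 5 = 105
```

Under these assumptions the programmed hold time is 105 minutes; the actual run time on an instrument would be longer because of temperature transitions.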

Since the amyL promoter is not present in the amplified Bacillus subtilis A164A5-T-JP170,
the pUC19 sequence (lacZ promoter) found upstream of the amyL promoter probably
served as the driving promoter for the Bacillus JP170 gene.

Reamplification of Bacillus subtilis A164A5-T-JP170 by plating on increasing
concentrations of chloramphenicol as described in Example 17 was performed in order to
obtain a deletion-free promoter/protease cassette. Genomic DNA from Bacillus subtilis
A164A5-T-JP170 was prepared by resuspending a single colony in 100 µl of deionized water,
boiling for 5 minutes, followed by freezing for 5 minutes [...].
The suspensions were centrifuged for 10 minutes. The PCR reactions were set up as mentioned
above using 5 µl of supernatant as template DNA and the primer pair TermlSfi/17021. At a
chloramphenicol concentration of 20 µg/ml, it was shown that a deletion was present in this
newly amplified version.

Retransformation of Bacillus subtilis A164A5 with pLC21 was performed in order to
obtain a deletion-free promoter/protease cassette. By PCR using the primer pair M13-48
Reverse/17021 as described above, it was shown that this unamplified strain was deletion free.
This strain was amplified by successive plating on increasing concentrations of
chloramphenicol as described in Example 17. PCR reactions using the primer pair M13-48
Reverse/17021 showed that the amplified version (up to 40 µg/ml chloramphenicol) was
deletion free. However, the deletion-free amplified version was difficult to grow and produced
very small halos on 1% milk-TBAB plates when compared to the amplified strain containing
the amyL deletion.

The Southern blot of Bacillus subtilis A1630A5-B-JP170, using the same protocol as for
Bacillus subtilis A164A5-T-JP170, did not show any deletion in the promoter/protease cassette.

Example 19: Expression of Bacillus JP170 protease in shake flasks

Bacillus subtilis A164A5-B-JP170, Bacillus subtilis A164A5-T-JP170, Bacillus subtilis
A1630A5-B-JP170, and Bacillus subtilis A1630A5-T-JP170 strains were cultivated at 37°C and
250 rpm for 5 days in shake flasks containing 50 ml of PS-1 medium composed of 10%
sucrose, 4% soybean flour, 0.42% anhydrous disodium phosphate, and 0.5% calcium carbonate,
supplemented with 5 µg of chloramphenicol per ml. In addition, Bacillus subtilis
A164A5::pCAsub2 containing the integration vector was used as a negative control.

The stability of the protease integration was confirmed via casein plating at the
beginning and at the end of each assay as described in Example 18. In each instance, the
[...] observed within a few hours).

SDS-PAGE analysis using Novex Precast Gels as described in Example 18 was
performed to determine the expression levels in both assays. When the four strains were
compared, it was observed that Bacillus subtilis A164A5-T-JP170 expression was greater
compared to Bacillus subtilis A164A5-B-JP170. The opposite was true for the Bacillus subtilis
A1630A5 strain, where expression of Bacillus subtilis A1630A5-B-JP170 was greater compared
to Bacillus subtilis A1630A5-T-JP170. The negative control produced no detectable JP170
protease.

Example 20: Comparison of Bacillus sp. JP170 protease to SAVINASE™

Wash tests were performed to compare the efficacy of the Bacillus sp. JP170 protease
(SP444) to SAVINASE™. The Bacillus sp. JP170 protease was obtained as described in WO
88/01293. SAVINASE™ was obtained from Novo Nordisk A/S, Bagsvaerd, Denmark.
The experimental conditions of the wash tests are enumerated below in Table 2.
Table 2

                       Protease Model Detergent    Koso Top Detergent
Detergent Dose         3 g/l                       0.7 g/l
pH                     9.5                         10.5
Wash Time              15 minutes                  10 minutes
Temperature            15°C                        20°C
Water Hardness         5.6°dH                      2.8°dH
                       (~1 mM Ca2+/Mg2+)           (~0.5 mM Ca2+/Mg2+)
Enzyme Concentration   0, 3, 6, 9, 12, 15, 30, 60, 90 nM
Test Method            Miniwash
Swatch/Volume          5 swatches (2.5 cm)/50 ml
Test Material          Grass on cotton (rinsed in water)

Koso Top (Lion Corp., Tokyo, Japan) is a commercial detergent, and therefore the
protease in the detergent was inactivated before the wash tests were performed. The protease
was inactivated by heating a solution of 10 g of detergent in 100 ml of deionized water to 85°C in
a microwave oven.

The model detergent was composed of 25% STP (Na5P3O10), 25% Na2SO4, 10%
Na2CO3, 20% LAS (Nansa 80S), 5% NI (Dobanol 25-7), 0.5% Na2Si2O5, 0.5%
carboxymethylcellulose (CMC), and 9.5% water. The pH was adjusted to 9.5.

Measurement of remission (R) on the test material was performed at 460 nm using an
Elrepho 2000 photometer (without UV). The measurements were fitted to the expression:

$$\Delta R = \frac{a \cdot \Delta R_{\max} \cdot c}{\Delta R_{\max} + a \cdot c} + b$$

The improvement factor (IF) was calculated using the initial slope: $IF = a/a_{\mathrm{ref}}$. $\Delta R$ is the wash
effect of the enzyme in remission units; $a$ is the initial slope of the fitted curve ($c \to 0$); $a_{\mathrm{ref}}$ is the
initial slope for the reference enzyme; $b$ is the intersection of the fitted curve and the y-axis; $c$ is
the enzyme concentration in nanomoles active enzyme per liter; and $\Delta R_{\max}$ is the theoretical
maximum wash effect of the enzyme in remission units ($c \to \infty$).
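
The fitted expression and the improvement factor can be illustrated numerically. The following is a minimal Python sketch under stated assumptions: the function names `wash_effect` and `improvement_factor` are hypothetical, and the parameter values in the demonstration are placeholders chosen for illustration, not measured data from the wash tests.

```python
def wash_effect(c, a, dr_max, b=0.0):
    """Wash effect (remission units) at enzyme concentration c (nM).

    a      : initial slope of the curve (c -> 0)
    dr_max : theoretical maximum wash effect (c -> infinity)
    b      : intercept of the curve with the y-axis
    """
    return (a * dr_max * c) / (dr_max + a * c) + b


def improvement_factor(a, a_ref):
    """Ratio of the initial slope of the test enzyme to that of the reference enzyme."""
    return a / a_ref


if __name__ == "__main__":
    # Placeholder parameters (assumptions, not values from the source).
    a_ref, a_test, dr_max = 0.5, 3.1, 12.0
    # Concentrations used in the wash tests (nM).
    for c in (0, 3, 6, 9, 12, 15, 30, 60, 90):
        print(c, round(wash_effect(c, a_test, dr_max), 2))
    print("IF =", improvement_factor(a_test, a_ref))
```

At c = 0 the expression reduces to b, and at very high concentration it approaches the maximum wash effect plus b, which matches the definitions of the parameters given in the text.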

The results of the wash tests demonstrated that the JP170 protease possessed an IF of
6.2 compared to 1.0 for SAVINASE™ in the model detergent as shown in Table 3. The JP170
protease also had an IF of 4.6 compared to 1.0 for SAVINASE™ in the Koso Top detergent.

Table 3

Protease          Concentration       Improvement factor
                                      Model Detergent    Koso Top
SAVINASE™         8.1 x 10^-[...]     1.0                1.0
JP170 (SP444)     3.77 x 10^-5 M      6.2                4.6

The wash results in the model detergent shown in Figure 9 demonstrated that the JP170
protease (SP444) performed significantly better than SAVINASE™ in removing grass stain
from cotton.

The wash results in the Koso Top detergent shown in Figure 10 demonstrated that the
JP170 protease (SP444) performed significantly better than SAVINASE™ in removing grass
stain from cotton.



Deposit of Biological Materials 

The following biological material has been deposited under the terms of the Budapest
Treaty with the Agricultural Research Service Patent Culture Collection, Northern Regional
Research Center, 1815 University Street, Peoria, Illinois, 61604, and given the following
accession number:

Deposit                             Accession Number    Date of Deposit
Bacillus subtilis LC20 (plTOBAN)    NRRL B-21680        April 4, 1997
SEQUENCE LISTING

(1) GENERAL INFORMATION 

(i) APPLICANT: Sloma, Alan 

Lynne, Christianson 

(ii) TITLE OF THE INVENTION : Nucleic Acids Encoding A Polypeptide 

Having Protease Activity 

(iii) NUMBER OF SEQUENCES: 57 

(iv) CORRESPONDENCE ADDRESS:

(B) STREET: 405 Lexington Avenue

(C) CITY: New York 

(D) STATE: NY 

(E) COUNTRY: USA 

(F) ZIP: 10174 

(v) COMPUTER READABLE FORM:

(A) MEDIUM TYPE: Diskette 

(B) COMPUTER: IBM Compatible 

(C) OPERATING SYSTEM: DOS 

(D) SOFTWARE: FastSEQ for Windows Version 2.0 
(vi) CURRENT APPLICATION DATA:

(A) APPLICATION NUMBER: to be assigned

(B) FILING DATE: 12-JUN-1998

(C) CLASSIFICATION: 

(viii) ATTORNEY/AGENT INFORMATION:

(A) NAME: Starnes, Robert L.

(B) REGISTRATION NUMBER: 41,324

(C) REFERENCE/DOCKET NUMBER: 5251.200-US 



(ix) TELECOMMUNICATION INFORMATION:

(A) TELEPHONE: 212-867-0123 

(B) TELEFAX: 212-878-9655 

(C) TELEX: 



(2) INFORMATION FOR SEQ ID NO: 1:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 22 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO : 1 : 
GAGCTCACAG AGATACGTGG GC 

22 

(2) INFORMATION FOR SEQ ID NO: 2:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 23 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO : 2 : 
GGATCCACAC CAAGTCTGTT CAT 

23 
(2) INFORMATION FOR SEQ ID NO: 3:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 base pairs 

(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3: 
GGATCCGCTG GACTCCGGCT G 

(2) INFORMATION FOR SEQ ID NO : 4 : 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 22 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4:
AAGCTTATCT CATCCATGGA AA 

(2) INFORMATION FOR SEQ ID NO : 5 : 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5:
AAGCTTAGGC ATTACAGATC 

(2) INFORMATION FOR SEQ ID NO: 6: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 33 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO : 6 : 
CGGATCTCCG TCATTTTCCA GCCCGATGCA GCC 

(2) INFORMATION FOR SEQ ID NO: 7: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 33 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7:
GGCTGCATCG GGCTGGAAAA TGACGGAGAT CCG 



(2) INFORMATION FOR SEQ ID NO: 8:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 18 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8:
GATCACATCT TTCGGTGG 

18 

(2) INFORMATION FOR SEQ ID NO: 9: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9: 
CGTTTATGAG TTTATCAATC 

20 

(2) INFORMATION FOR SEQ ID NO: 10: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10: 
AGACTTCCCA GTTTGCAGGT 

20 

(2) INFORMATION FOR SEQ ID NO: 11: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 35 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11: 
CAAACTGGGA AGTCTCGACG GTTCATTCTT CTCTC



(2) INFORMATION FOR SEQ ID NO: 12: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12: 
TCCAACAGCA TTCCAGGCTG 

20 

(2) INFORMATION FOR SEQ ID NO: 13: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 29 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13:
GCGAATTCTA CCTAAATAGA GATAAAATC

29

(2) INFORMATION FOR SEQ ID NO: 14: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 36 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO : 14 : 
GTTTACCGCA CCTACGTCGA CCCTGTGTAG CCTTGA 

(2) INFORMATION FOR SEQ ID NO: 15: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 36 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15: 
TCAAGGCTAC ACAGGGTCGA CGTAGGTGCG GTAAAC 

(2) INFORMATION FOR SEQ ID NO: 16: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 29 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16:
GCAAGCTTGA CAGAGAACAG AGAAGCCAG 

(2) INFORMATION FOR SEQ ID NO: 17: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 28 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17: 
CGTCGACGCC TTTGCGGTAG TGGTGCTT 

(2) INFORMATION FOR SEQ ID NO: 18: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 36 base pairs 
(B) TYPE: nucleic acid

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18: 
CGCGGCCGCA GGCCCTTAAG GCCAGAACCA AATGAA 
(2) INFORMATION FOR SEQ ID NO: 19: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 34 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19: 
TGGCCTTAAG GGCCTGCGGC CGCGATTTCC AATG 



(2) INFORMATION FOR SEQ ID NO: 20: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 28 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 20: 
GAAGCTTCTT CATCATCATT GGCATACG 



(2) INFORMATION FOR SEQ ID NO: 21: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 19 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 21: 
AAGCTTTGAA TGGGTGTGG 

19 

(2) INFORMATION FOR SEQ ID NO: 22: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 36 base pairs 

(B) TYPE : nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:22: 
CCGCTTGTTC TTTCATCCCC TGAAACAACT GTACCG 



(2) INFORMATION FOR SEQ ID NO: 23: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 34 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 23:
CAGTTGTTTC AGGGGATGAA AGAACAAGCG GCTG

34

(2) INFORMATION FOR SEQ ID NO: 24: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 18 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 24:
CTGACATGAG GCACTGAC

18

(2) INFORMATION FOR SEQ ID NO: 25: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 38 amino acids 

(B) TYPE : amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 25:
Asn Asp Val Ala Arg Gly Ile Val Lys Ala Asp Val Ala Gln Asn Asn
1               5                   10                  15
Phe Gly Leu Tyr Gly Gln Gly Gln Ile Val Ala Asp Thr Gly Leu Asp
            20                  25                  30
Thr Gly Arg Asn Asp Ser
        35

(2) INFORMATION FOR SEQ ID NO: 26: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 26: 
Gly Ala Ala Asp Val Gly Leu Gly Phe Pro Asn Gly Asn Gln Gly Trp
1               5                   10                  15
Gly Arg Val Thr Leu Asp Lys
            20

(2) INFORMATION FOR SEQ ID NO: 27: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 29 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 27:
CCCCANCCNT GNTTNCCNTT NGGNAANCC 

(2) INFORMATION FOR SEQ ID NO: 28:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 38 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:28: 
GGNATNGTNA ANGCNGANGT NGCNCANAAN AANTTNGG 

(2) INFORMATION FOR SEQ ID NO: 29: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 38 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:29: 
TANGGNCANG GNCANATNGT NGCNGTNGCN GANACNGG 

(2) INFORMATION FOR SEQ ID NO: 30: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 31 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 30:
GTAGGTTTTC GGTTGCCCCA ACTGTAATCG C

(2) INFORMATION FOR SEQ ID NO: 31:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 31 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:31: 
GGTCCTACTA GAGATGGACG TATTAAGCCG G 

(2) INFORMATION FOR SEQ ID NO: 32: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 33 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 32: 
CTCCCCCGGG GATGTGTTAT AAATTGAGAG GAG 

(2) INFORMATION FOR SEQ ID NO: 33: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 27 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 33: 
CCTCGTGAAG AGAATTGAGC AACATGG 

(2) INFORMATION FOR SEQ ID NO: 34: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH : 21 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 34: 
GCGATTACAG TTGGGGCAAC C 

(2) INFORMATION FOR SEQ ID NO: 35: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 31 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 35: 
GCGGCCGCGT ACTCTCATCA ATTTCCCAAG C 

(2) INFORMATION FOR SEQ ID NO: 36: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 32 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 36:
GCGGCCGCGT CATAAACGTT GCAATCGTGC TC

(2) INFORMATION FOR SEQ ID NO: 37: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 42 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 37: 
TTTGGCCTTA AGGGCCTGCA ATCGATTGTT TGAGAAAAGA AG 

(2) INFORMATION FOR SEQ ID NO: 38: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 43 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 38:
TTTGAGCTCC ATTTTCTTAT ACAAATTATA TTTTACATAT CAG 



(2) INFORMATION FOR SEQ ID NO: 39: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 33 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 39: 
CCAGGCCTTA AGGGCCGCAT GCGTCCTTCT TTG 



(2) INFORMATION FOR SEQ ID NO: 40: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 28 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:40: 
CCAGAGCTCC TTTCAATGTA ACATATGA 



(2) INFORMATION FOR SEQ ID NO: 41: 

(i) SEQUENCE CHARACTERISTICS: 



(B) TYPE: nucleic acid

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE : Genomic DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 41: 




60 
120 
180 
240 
300 
360 
420 
480 
540 
600 
660 
72 0 
780 
840 
900 
960 
1020 
1080 
1140 
1200 
1260 
ATTTGGAATC ®"*^9^EE^ GAAACAGTAG CTTTAGATAA AAAGCAAAGA AGTAAAGAAG US

GAGGAGCTTC TGAATTAGTA OAAAMTAG ™AGAU TAATGAT GTATTATACG 1440 

TACGTTTAAG AGGATTGGAA ^^CCC ATGACGTGGC CCGTGGCATT GTGAAAGCAG 1500 

TAACCCCAAA GCCTGAATAC ™TTTTGA JTGACGl G ATTGT AG C A GTTGCTGATA 1560 

ACGTCGCACA AAATAACTTT JGJTTATATG GACAAGGA TCCGC QGTAAGATTA 1620 

CTGGGCTTGA TACAGGAAGA AAJGACAGTT JGAT6WVI TCCAAATGGA CATGGAACCC 1680 

CCGCACTATA TGCACTGGGC AGAACGAATA ACGCCAA^ GATGGCACCG CAAGCCAATC 1740 

ATGTTGCTGG ATCTGTGTTA GGAAATGCTA CAAATAAA££ AGGACTACCT GCTAATCTAC 1800 

Stctttca atctattatg gatagtggtg gagggctggg aggactacc tcatgggggg 1860 

AAACATTATT CAGTCAAGCA TATAGTGCTG gGCGAGAAT TCMJC GTGAGAAAAA 1920 

CTCCAGTAAA CGGTGCCTAT ACGACAGACT CTCGAAATU1 GQT ACAATCA GTG " 

ATGATATGAC GATTCTTTTT GCGGCCGGAA ATGAGGGACC AGGT^^ CGTCCAAQCT 
CACCAGGAAC AGCAAAAAAT GCGATTACAG ™™ AA £ p CTCTTC ACGA GGTCCTACTA 
TCGGATCTTA TGCGGATAAT JTTAACCATG ™™gTT gC^^ TCTGCTAGAT 
GAGATGGACG TATTAAGCCG GACGTCATGG TAGTAAATAT GCCTACATGG 

CATCATTAGC TCCAGATTCC TCATTCTGGG CAftAUUHi TGCACAATTA AGGGAGCATT 
GTGGTACTTC TATGGCTACT CCAATTGTAG CAGGTAATGT ^ TTAATTGCAG 
TTGTGAAAAA TAGAGGGGTA ACTCCTAAGC CTTCCCTTT^ ££££ AGAGTAACGT 
GTGCTGCGGA TGTTGGACTT ^GCTTTCCAA ATGGTAACCA JGGA^^ 
TAGATAAATC CCTAAATGTC GCATTTGTGA AAAAATATCA CTTGTTTGGT 

AAGCAACATA TTCGTTTACG GCTCAAGCTG GTAAA GAATGATTTA GACTTAGTAA 

CAGATGCACC AGGTAGCACG ACGGCATCAC JAACTll TACAGCACCG TATGATAACA 

TCACTGCACC AAATGGAACT AAATACGTCG ^J^j^^ T GCTCCTCAA AGCGGAACGT 
ATTGGGATGG CAGAAACAAC GTGGAAAATG ^^TCAA TCTTTAGCGA 
ATACAGTCGA AGTGCAGGCT JACAATGTAC ™g*»gjg GCAAA^^ ^ CTTTTTTT 

SSSS SS SSSSSS GAAAAGCAAC AAAGTATGCG 3000 



AAA 



(2) INFORMATION FOR SEQ ID NO: 42: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 641 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 
(v) FRAGMENT TYPE: internal 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 42:

Mar Arg Lys Lys Gly Sar Lys Arg Val Pha Lau Sar Val Lau S« Val 

aL Ala Lau Lau Sar Ser Val Ala Lau Sar sar Pro Sar Thr Ha Gly 

Ma Asn AS„ IL Glu Lau Asp Pha Lys Gly lis Glu Thr Lau Thr Lau 

Glu Lys Ala Ala Thr Lys Gin Gly Lys Thr Gly Lys Ala sar Pha Lau 

Val aL sar Glu Asn Val Lys Ha Pro Lys sar Ha Glu Lys Lys Lau 

IL val val Pro Ala Asp Asn Lys Lau Tyr 11a Val Gin Pha Asp Gly 

Pro Ila Lau Glu Glu Thr Gin Lau Gin Lau Glu Lys Thr Gly Ala Lys 

lie Lau Asp Tyr Ila Pro Asp Tyr Ala Tyr Ila Val Glu Tyr Asp Gly 

Mp val Lys Ala val Thr Asa Ala Ila Ala His Lau Glu sar val Glu 



p7o £°r Lau Pro Lau Tyr Lys Ila Asp Pro Gin Lau Pha sar Arg Gly 
HI sar Glu Lau val Glu Thr Val Ala Lau Asp Lys Lys Gin Arg Sar 
165 170 



1980 
2040 
2100 
2220
2340
2400
2580
2820
2880 
Lya Olu Val tog Leu Arg Gly Lau Glu Gin He Ala Gin Tyr Ala Thr
Aan As„ top Val Leu Tyr Val Thr "o Lys Pro Glu Tyr G?u Val Leu 
Asn Asp Val Ala Arg Gly He Val Lys Ala Asp Val Al°a Gin Asn Asn 
Phe Gly -Leu Tyr Gly Gin Gly Gin He Val Ala Val° Ala Asp Thr Gly 
Lau Asp Thr Gly Arg Asn Asp Ser Ser Met Sis Glu Ala Phe Arg a" 
Lys He Thr Ala Leu Tyr Ala Leu Gly Arg Thr Asn Aan Ala Aan Asp 
Pro Asn Gly His Gly Thr His Val Ala Gly Ser Val Leu al'y Aan Ala 
Thr Aan Lys Gly Met Ala Pro Gin Ala Asn Leu Val Phe Gin Ser He 
Met Asp ser Gly Gly Gly Lau Gly Gly Leu Pro Al°a Asn Leu Gin Thr 
Lau Phe sar Gin Ala Tyr Ser Ala Gly Ala tog He His Thr ton III 
Trp Gly Ala Pro Val Aan Gly Ala Tyr Tto Thr Asp Ser Arg Ian Val 
Asp Aap Tyr Val tog Lya ton top Met Thr He Lau Ph. HI M a Gly 
ton Sin Gly Pro Gly Ser Gly Thr Ha Sar Ala Pro 3g Thr Ala Lys 
ton Ala Ha Thr Val Gly Ala Thr Glu ton Leu tog Pro Ser Phe Gly 
Ser Tyr Ala top ton Ha ton His Val Ala "n Phe Ser Ser Arg G?y 
Pro Thr tog top Gly Arg He Lys Pro As" Val Met Ala Pro ajy Thr 
Tyr Ha Lau Ser Ala tog Ser Ser Leu Ala Pro Asp Ser Ser Phe Trp 
Ala ton Hie top Sar Lys Tyr Ala Tyr Met Gly Gly T hr Ser Met Ala 
Thr Pro He Val Ala Gly ton Val Ala Gin Leu £° Glu His Phe Val 
Lys ton Arg Gly Val Thr Pro Lys Pro Ser Leu Leu Lys Ala Ala Leu 
He Ala Gly Ala Ala Asp Val Gly Leu Gly Phe Pro ton Gly ton Gin 
Gly Trp Gly tog Val Thr Leu Asp Lys Ser Leu ton Val III p he val 
ton Glu Thr ser Pro Lau Ser Thr Ser Gin Lys Ala Tnr Tyr Sar Pha 
Thr Ala Gin Ala Gly Lys Pro Leu Lys He Ser Leu Val Trp Ser top 
Ala Pro Gly Ser Thr Thr Ala Ser Leu Thr Leu Val Asn top Leu 5£ 
Leu val Ha Thr Ala Pro ton Gly Thr Lya Tyr Val Gly ton As ? P Phe 
Thr Ala pro Tyr top Asn ton Trp top Gly tog ton ton Vel° Glu ton 
Val Phe He Asn Ala Pro Gin Ser Gly Thr Tyr Thr val Glu Val Gin 
Ala Tyr ton Val Pro Val Ser Pro Gin Thr Phe III Leu Ala He Val 
His 635 640 

(2) INFORMATION FOR SEQ ID NO: 43: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 635 amino acids

(B) TYPE: amino acid 

(C) STRANDEDNESS : single 
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 43: 

Me , Lys «, ^ w- « ~ - ;r val val Ma ser " a T 

xi. 1 Ala Ser Val Me, VaX ser Ser Pro - Ser G ly AX. Asp - 
Gln val Asn Phe Aan Sly VaX Lys Ser Leu Slu Asn Ala Ser Leu * 
x, s Pro xle ser Ser Sly Olu S. Ser Phe leu VaX Asp Thr OXu « 
xL Sn XXe Pro lys SXy S. SXn Lys Lys Leu SXu AXa V al Sin Lys 

« Mn - - y S. - - « - - « t «. -» 

Glu „ Lys ox y 2» - ser Leu sly v., ser xle Leu asp Tyr - 
Pro MP Tyr i£ - u. g; ^ « *| w - "° 

ser *hr 52 His ser Va, SXu aL va! Sin Pro Phe X.eu Pro Leu Tyr 
Lys XXe Asp Pro 01» Leu Su Thr Lys SXy Ma Ser SXn Leu Va! 
it! va! Xle leu Asn SS Lys His SXu Asn Lys Asn Me, Lys Phe Thr 
Gly ,eu Asp Slu III vax SXn Tyr Ala Ala Asn Asn Asp VaX Leu Tyr 
xxl S er Pro iys Pro SXu Tyr Slu K «e, Asn Asp -1 - Arg «Y 
Ile val iyl Ala Asp Val Ala SSS Asn - Tyr SXy Leu Tyr Sly sin 
G1 y sin Leu va! Ala VaX AXa Asp Thr SXy Leu Asp Thr G Xy Arg « 
Si ser ser Me, His SS AX, - - «*" °« SB ^ 

Ja Leu G1 y Arg SS « « "J - « ~» SS ~ 
His vax AXa ?Xy Ser VaX leu SXy Asn AXa leu Asn ,ys sly Met Ala 
pro Sin Sa M u Leu VaX Phe S£ Ser XXe Me, ASP Ser Ser OXy OXy 
« l!y SXy Leu Pro ser IS Leu M n Thr Leu Phe Ser OXu R Xa Trp 
J£ ». SXy «a *rg l" ^ Thr A s„ Ser Trp S Xy ,Xa Pro VaX » 
325 „ » m val Asp Slu Tyr val Arg Asn 

G ly Ala Tyr Thr Ala Asn ser Arg GXn val Asp y ^ 

Rsn Asp Me, T„r val Leu Phe Ala Ala Sly Asn Slu Sly Pro Asn ser 
Gly Thr XXe ser AXa Pro SXy JS AXa ,ys Asn Ala xle Thr val Sly 
Ma Th°r Slu Asn Tyr Arg PrI Ser Phe Sly Ser Xle Ala Asp Asn Pro 
IS His Xle Ale sin III Ser ser Arg Sly Ala Thr Arg Asp Sly Arg 
405 , ~ ^.1,, ihr Phe lie Leu ser Ala Arg 

lie Lys Pro Asp val Thr Ale Pro sly Thr Phe lie u ^ 

ser ser Leu AXa Pro Asp ser ser Phe Trp Ala Asn Tyr Asn ser x-ys 
.yr AXa & Me, SXy SXy Thr Ser Me, AXa Thr Pro XXe VaX Ala SXy 
As „ vax AXa SXn Leu Arg slu His Phe Xle Lys Asn Arg Sly XXe Thr 
& ,ys Pro ser Leu ill Lys AXa AXa Leu Xle Ala Sly Ala Thr Asp 
VaX OXy Leu SXy Tyr Pro Ser SXy Asp So sly Trp SXy Arg VaX Thr 
500








505 


Leu 


Asp 


Lys 


Ser 


Leu 


Asn 


Val 


Ala Tyr Val Asn 


Ala 




515 










520 


Thr 
530 


Gly 


Gin 


Lys 


Ala 


Thr 
535 


Tyr Ser Phe Gin 


Pro 
545 
Ala 


Leu 
Ser 


Lys 
Tyr 


lie 
Thr 


Ser 

Leu 
565 


Leu 
550 
Val 


Val 
Asn 


Trp Thr Asp Ala 
Asp Leu Asp Leu 


Asn 


Gly 


Gin 


Lys 


Tyr 


Val 


Gly 


Asn Asp Phe Ser 








580 








585 


Trp 


Asp 


Gly 


Arg 


Asn 


Asn 


Val Glu Asn Val 


Gin 




595 










600 


Ser 


Gly 


Thr 


Tyr 


lie 


lie 


Glu Val Gin Ala 


Gly 
625 


610 










615 


Pro 


Gin 


Arg 


Phe 


Ser 
S30 


Leu 


Ala lie Val His 
635 
510

Glu Ala Thr Ala Leu 
525 

Ala Gin Ala Gly Lys 
540 

Pro Gly Ser Thr Thr 
560 

Val lie Thr Ala Pro 
575 

Tyr Pro Tyr Asp Asn 
590 

Phe lie Asn Ala Pro 
605 

Tyr Asn Val Pro Ser 
620 



(2) INFORMATION FOR SEQ ID NO: 44: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 418 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 44:
Met Lys Arg Ser Gly Lys He Phe Thr Thr Ala Met Leu Ala Val Thr 
Leu Met Met Pro Ala He Gly Val Ser aL Asn Arg Gly Asn Ala Ala 
Asp Gly Asn Glu Lys Phe Arg Val Leu Val Asp Ser Ala Asn Gin Asn 
Asn Leu Lys Asn Val Lys Glu Gin Tyr Gly Val His Jrp Asp Phe Ala 
Gly Glu Gly Phe Thr Thr Asn Met Asn Glu Lys §ln Phe Asn Ala Leu 
Gin Asn Asn Lys Asn Leu Thr Val Glu Lys Val Pro Glu Leu Glu n e 
Ala Thr Ala Thr Asn Lys Pro Glu Ala 1 Tyr Asn Ala Met Ala Ala 
Ser Gin Ser Thr Pro Trp Gly lie Jys Ala He Tyr Asn Asn Ser Asn 
Leu Thr Ser Thr Ser Gly Gly Ala Gly He Asn He Ala Val Leu Asp 
Thr Gly Val Asn Thr Asn His Pro Asp Leu Ser Asn Asn Val Glu Gin 



150 155 --— 

Cys Lys Asp Phe Thr Val Gly Thr Asn Phe Thr Asp Asn Ser Cys ^nr 

170 1"7C 

Asp Arg Gin Gly His Gly Thr His Val Ala Gly Ser Ala Leu HI Asn 
Gly Gly Thr Gly Ser Gly Val Tyr lly Val Ala Pro Glu 111 Asp Leu 
Trp Ala Tyr Lys Val Leu Gly Asp Asp Gly Ser Gly ?yr Ala Asp Asp 
lie Ala Glu Ala He Arg His Ala Gly Asp Gin Hi Thr Ala Leu Asn 
Thr Lys Val Val lie Asn Met Ser Leu Gly Ser Ser Gly Glu Ser Ser 
Leu He Thr Asn Ala Val Asp Tyr Ala Tyr Asp Lys Gly Val Leu He 
He Ala Ala Ala Gly Asn Ser Gly III Lys Pro Gly Ser He Gly Tyr 

280 ootr 

Pro Gly Ala Leu Val Asn Ala Val Ala Val Ala Ala Leu Glu Asn Thr 
295 300 




Ile Gin Asn Gly Thr Tyr Arg Val Ala Asp Phe Ser Ser Arg Gly His 
L ° = Thr Ala Gly Asp Tyr Val Ile Gin ,ys Gly Asp val Glu lie Ser 
ua Pro Gly Ala A?a Val Tyr Ser Thr Trp Phe Asp Gly Gly Tyr Ala 
Th r Ile ser siy Thr ser Met Ala Ser Pro His Ala Ala Gly U» Ala 
M a , y s ire Trp Ala Gin Ser Pro Ala Ala Ser Asn Val Asp val Ar g 
oly £S Leu Gin Thr Arg Ala Ser val Asn ftjp He Leu ser Gly Asn 
SU Ala Gly ser Gly Asp Asp lie Ala Ser Gly Phe Gly Phe Ala L ys 
Val Gin 

(2) INFORMATION FOR SEQ ID NO: 45: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 48 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 45: 
AGCTTGGCCT TAAGGGCCCG ATATCGGATC CGCGGCCGCT GCAGGTAC 

(2) INFORMATION FOR SEQ ID NO: 46: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 40 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 46:
CTGCAGCGGC CGCGGATCCG ATATCGGGCC CTTAAGGCCA 

(2) INFORMATION FOR SEQ ID NO: 47: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 47: 
GCGGCCGCGA TTTCCAATGA G 

(2) INFORMATION FOR SEQ ID NO: 48: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:48: 
GGTACCTGCA TTTGCCAGCA C 
(2) INFORMATION FOR SEQ ID NO: 49:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 19 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 49: 
GCTGCACTAT TGTCTTCTG 

19 

(2) INFORMATION FOR SEQ ID NO: 50: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 19 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 50: 
CAGCAACTGC TACAATCTG 

19 

(2) INFORMATION FOR SEQ ID NO: 51: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 22 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 51: 
GTGCAGGCTT ACAATGTACC AG 

22 

(2) INFORMATION FOR SEQ ID NO: 52: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 24 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 52:
GCATTTACCT GGCTCCAATG ATTC 

24 

(2) INFORMATION FOR SEQ ID NO: 53: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 53: 
CCAATAGTAG AAGGACTG 

18 
(2) INFORMATION FOR SEQ ID NO: 54:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 37 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 54: 

CTTCAGATTG GAAAGCGAGC GGACGGAATC ATTGATC 

(2) INFORMATION FOR SEQ ID NO: 55: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 55:
CTCAGCTTGA AGAAGTGA

(2) INFORMATION FOR SEQ ID NO: 56: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 56:
GAAGCAGAGA GGCTATTG 

(2) INFORMATION FOR SEQ ID NO: 57: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 19 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 57:
GAAAATATAG GGAAAATGT
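
Each entry in the listing above declares a LENGTH that should agree with the number of bases actually printed. The following is a minimal sketch of such a consistency check for SEQ ID NOS: 48-57. The function names are hypothetical and chosen only for illustration; the declared lengths and sequences are copied from the entries above.

```python
# Declared length and printed sequence for each primer, copied from the listing.
PRIMERS = {
    48: (21, "GGTACCTGCA TTTGCCAGCA C"),
    49: (19, "GCTGCACTAT TGTCTTCTG"),
    50: (19, "CAGCAACTGC TACAATCTG"),
    51: (22, "GTGCAGGCTT ACAATGTACC AG"),
    52: (24, "GCATTTACCT GGCTCCAATG ATTC"),
    53: (18, "CCAATAGTAG AAGGACTG"),
    54: (37, "CTTCAGATTG GAAAGCGAGC GGACGGAATC ATTGATC"),
    55: (18, "CTCAGCTTGA AGAAGTGA"),
    56: (18, "GAAGCAGAGA GGCTATTG"),
    57: (19, "GAAAATATAG GGAAAATGT"),
}


def normalize(seq: str) -> str:
    """Remove the ten-base block spacing used in the listing and upper-case."""
    return "".join(seq.split()).upper()


def check_primer(declared_length: int, seq: str) -> bool:
    """True if the sequence holds only A, C, G, T and matches its declared length."""
    bases = normalize(seq)
    return len(bases) == declared_length and set(bases) <= set("ACGT")


def mismatched(primers: dict) -> list:
    """Return the SEQ ID numbers whose printed sequence fails the check."""
    return sorted(
        seq_id
        for seq_id, (length, seq) in primers.items()
        if not check_primer(length, seq)
    )


if __name__ == "__main__":
    print(mismatched(PRIMERS))
```

An empty list indicates that every printed primer agrees with its declared length.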

Claims

What is claimed is: 

1. An isolated nucleic acid sequence encoding a polypeptide having protease activity, 
selected from the group consisting of: 

(a) a nucleic acid sequence encoding a polypeptide having an amino acid sequence 
which has at least 95% identity with the amino acid sequence of SEQ ID NO:43; 

(b) a nucleic acid sequence encoding a polypeptide having an amino acid sequence
which has at least 85% identity with the amino acid sequence of SEQ ID NO:42; 

(c) a nucleic acid sequence having at least 95% homology with the mature polypeptide
encoding region of the nucleic acid sequence of SEQ ID NO:41;

(d) an allelic variant of (a), (b), or (c); and 

(e) a subsequence of (a), (b), (c), or (d), wherein the subsequence encodes a polypeptide 
fragment which has protease activity. 



2. The nucleic acid sequence of claim 1, which encodes a polypeptide having an amino
acid sequence which has at least 95% identity with the amino acid sequence of SEQ ID NO:43.

3. The nucleic acid sequence of claim 1, which encodes a polypeptide having the amino
acid sequence of SEQ ID NO:43, or a fragment thereof which has protease activity.

4. The nucleic acid sequence of claim 3, which encodes a polypeptide having the amino
acid sequence of SEQ ID NO:43.

5. The nucleic acid sequence of claim 2, wherein the nucleic acid sequence encodes a 
polypeptide having protease activity obtained from a Bacillus strain. 

6. The nucleic acid sequence of claim 1, which encodes a polypeptide having an amino 
acid sequence which has at least 85% identity with the amino acid sequence of SEQ ID NO:42.

7. The nucleic acid sequence of claim 1, which encodes a polypeptide having the amino 
acid sequence of SEQ ID NO:42, or a fragment thereof which has protease activity. 

8. The nucleic acid sequence of claim 7, which encodes a polypeptide having the amino
acid sequence of SEQ ID NO:42.

9. The nucleic acid sequence of claim 6, wherein the nucleic acid sequence encodes a
polypeptide having protease activity obtained from a Bacillus strain.

10. The nucleic acid sequence of claim 1, which has at least 95% homology with the mature
polypeptide encoding region of the nucleic acid sequence of SEQ ID NO:41. 

11. The nucleic acid sequence of claim 1, which has the nucleic acid sequence of SEQ ID 
NO:41. 

12. The nucleic acid sequence of claim 10, wherein the nucleic acid sequence encodes a 
polypeptide having protease activity obtained from a Bacillus strain. 

13. The nucleic acid sequence of claim 1, wherein the nucleic acid sequence encodes a 
polypeptide having protease activity obtained from a Bacillus strain NCIB 12513.

14. The nucleic acid sequence of claim 1, which comprises the protease-encoding nucleic
acid sequence contained in the plasmid P 170BAN which is contained in Bacillus subtilis LC20
NRRL B-21680.

15. A nucleic acid construct comprising the nucleic acid sequence of claim 1 operably
linked to one or more control sequences which direct the production of the polypeptide in a 
suitable expression host. 

16. A recombinant expression vector comprising the nucleic acid construct of claim 15, a
promoter, and transcriptional and translational stop signals.

17. The vector of claim 16, further comprising a selectable marker.

18. A recombinant host cell comprising one or more copies of the nucleic acid construct of
claim 15.

19. The cell of claim 18, wherein the nucleic acid construct is contained on a vector.

20. The cell of claim 18, wherein the nucleic acid construct is integrated into the host cell
genome.

21. The cell of claim 18, wherein the host cell is a bacterial cell.

22. The cell of claim 21, wherein the bacterial cell is a Bacillus, Streptomyces, or
Pseudomonas cell.

23. The cell of claim 22, wherein the Bacillus cell is a Bacillus alkalophilus, Bacillus 
amyloliquefaciens, Bacillus brevis, Bacillus circulans, Bacillus coagulans, Bacillus firmus, 
Bacillus lautus, Bacillus lentus, Bacillus licheniformis, Bacillus megaterium, Bacillus pumilus,
Bacillus stearothermophilus, Bacillus subtilis, or Bacillus thuringiensis strain.

24. A method for producing a polypeptide having protease activity comprising (a) 
cultivating the host cell of claim 18 under conditions suitable for the production of the 
polypeptide; and (b) recovering the polypeptide. 


25. A method for producing a mutant of a cell, which comprises disrupting or deleting the 
nucleic acid sequence of claim 1 or a control sequence thereof, which results in the mutant 
producing less of the polypeptide than the cell. 

26. A mutant of a cell obtained by the method of claim 25.

27. The mutant cell of claim 26, which further comprises one or more copies of a nucleic
acid sequence encoding a heterologous protein.

28. A method for producing a heterologous protein comprising:

(a) cultivating the mutant cell of claim 27 under conditions suitable for production of 
the protein; and 

(b) recovering the protein. 
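
Claims 1, 2, and 6 set thresholds of "at least 95% identity" and "at least 85% identity". The following is a minimal sketch of how a percent identity figure can be computed once two sequences have been aligned. It assumes the sequences are already aligned to equal length with "-" marking gaps, and that identity is counted over all aligned columns; both are assumptions made for illustration, and the alignment method that governs the claims is the one defined in the description. The function names and the toy sequences are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of aligned columns in which both sequences carry the same residue.

    Columns where both sequences have a gap are ignored; a column with a gap
    in only one sequence counts as a mismatch (an assumption of this sketch).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [
        (a, b)
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if not (a == "-" and b == "-")
    ]
    if not columns:
        raise ValueError("no aligned columns to compare")
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


def meets_threshold(aligned_a: str, aligned_b: str, threshold: float) -> bool:
    """True if the percent identity is at least the given threshold."""
    return percent_identity(aligned_a, aligned_b) >= threshold


if __name__ == "__main__":
    # Hypothetical 20-residue toy sequences, not taken from SEQ ID NO:42 or 43.
    reference = "MKKLVAGSTTALNDLQYVPA"
    candidate = "MKKLVAGSTTALNDLQYVPG"  # one substitution in 20 columns
    print(percent_identity(reference, candidate))
    print(meets_threshold(reference, candidate, 95.0))
```

The choice of denominator (aligned columns, shorter sequence, or reference length) changes the result for gapped alignments, so any comparison against the claimed thresholds should use the definition given in the description.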

[Figure: nucleotide sequence, positions 1 to 2250, with the deduced amino acid sequence shown beneath the coding region; the sequence text is not legible in this copy.]

GCAGGTAATGTTGCACAATTAAGGGAGCATTTTGTGAAAAATAGAGGGGTAACTCCTAAGCCTTCCCTTTTAAAA 2325
AGNVAQLREHFVKNRGVTPKPSLLK

GCTGCTTTAATTGCAGGTGCTGCGGATGTTGGACTTGGCTTTCCAAATGGTAACCAAGGATGGGGAAGAGTAACG 2400
AALIAGAADVGLGFPNGNQGWGRVT

TTAGATAAATCCCTAAATGTCGCATTTGTGAATGAAACGAGCCCTTTATCAACAAGTCAAAAAGCAACATATTCG 2475
LDKSLNVAFVNETSPLSTSQKATYS

TTTACGGCTCAAGCTGGTAAACCCTTAAAAATATCACTTGTTTGGTCAGATGCACCAGGTAGCACGACGGCATCA 2550
FTAQAGKPLKISLVWSDAPGSTTAS

CTAACTTTAGTGAATGATTTAGACTTAGTAATCACTGCACCAAATGGAACTAAATACGTCGGAAATGACTTTACA 2625
LTLVNDLDLVITAPNGTKYVGNDFT

GCACCGTATGATAACAATTGGGATGGCAGAAACAACGTGGAAAATGTGTTTATCAATGCTCCTCAAAGCGGAACG 2700
APYDNNWDGRNNVENVFINAPQSGT

TATACAGTCGAAGTGCAGGCTTACAATGTACCAGTAAGTCCGCAAACCTTTTCTTTAGCGATTGTACATTAAAAT 2775
YTVEVQAYNVPVSPQTFSLAIVH
ATTGGAAGGAAGAGTTGTTGATGAATATATCAGCAGCTCTTTTTTTGATTAAGCTCTTTTCGTAAAGGTTGTTGC 2850
TTTAAGTCGGTAAAAAGTCGGTATTTGGACTTTTTACCAGTCATTTTGCTTGGGAAATTGATGAGAGTACTTTCA 2925
TTACTGATGGAAAAGAGCACGATTGCAACGTTTATGACGGGGTGATTTCTATTTACGAAAAGCAACAAAGTATGC 3000
GAAA 3004
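
The amino acid line printed beneath each nucleotide line of the figure can be checked by translating the nucleotides codon by codon. The following is a minimal sketch using the standard genetic code. It takes the 75-base segment at positions 2251 to 2325, with the scan's "7" characters read as T (an assumption), and compares the translation against the residues AGNVAQLREHFVKNRGVTPKPSLLK. The function name is hypothetical.

```python
from itertools import product

# Standard genetic code, listed in the conventional TCAG order
# (first base varies slowest, third base fastest). "*" marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate a DNA string in frame 1; trailing bases short of a codon are dropped."""
    dna = "".join(dna.split()).upper()
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


if __name__ == "__main__":
    # Positions 2251-2325 of the figure, with "7" read as T (assumption).
    segment = (
        "GCAGGTAATGTTGCACAATTAAGGGAGCATTTTGTGAAAAATAGAGGG"
        "GTAACTCCTAAGCCTTCCCTTTTAAAA"
    )
    print(translate(segment))
```

A mismatch between the translation and the printed residues points to a misread base or a misread residue on that line.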

Fig. 6A

Fig. 6B

JP170 vs. subtilisin

[Figure: amino acid sequence alignment of the JP170 protease with subtilisin; the alignment text is not legible in this copy.]